That is part of the 'careful design' I was referring to. There is also an issue of route congestion, which the current autorouters do not handle well in high-performance designs. Hopefully, the selects can be registered to cut down on the timing problems there.

At that high data rate, one normally looks for ways to increase the permissible pipelining in an FPGA. There really are not all that many scenarios where pipelining can't be applied. Most of the time, these occur when there is a tight feedback loop, which in the case of a multiplexer there probably should not be. Also, it occurs to me that he may be looking for a combinatorial multiplexer with the inputs and outputs going off-chip, in which case the FPGA is not anywhere close to fast enough.

Uwe Bonnes wrote:
> : In comp.lang.vhdl Ray Andraka <ray@andraka.com> wrote:
> : With careful design and floorplanning, in a -6 part it should be, but
> : you also need to consider the timing in and out of it. Don't count on
> : the automatic place and route to get a minimal delay solution: you may
> : find you need to do some hand routing on this. I suggest you try it
> : out with a one or two bit version first and see if you can make your
> : requirements.
>
> What about the select signals? With a large multiplexer for a wide bus,
> the load for these signals becomes large. Buffering these signals in a
> sensible way is probably needed. Does the user have to take care of this
> in his code, or does place and route?
>
> Bye
>
> --
> Uwe Bonnes bon@elektron.ikp.physik.tu-darmstadt.de
> Institut fuer Kernphysik Schlossgartenstrasse 9 64289 Darmstadt
> --------- Tel. 06151 162516 -------- Fax. 06151 164321 ----------

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759

Article: 47751
Hi!

I wrote to Xilinx Support, and they gave me the following example; possibly it might help in the future:

##start
Below you will find the format of the input and output and how this relates to the square root. The inputs and outputs to the CORDIC core are Q1 format binary numbers. Some examples of the Q1 number format are given below.

For an 8 bit input the format of the input is
0.0001000 = (SQRT 1/16)
and the format of the output is
0.0100000 = (1/4)

For a 20 bit input the format of the input is
0.0001000000000000000 = (SQRT 1/16)
and the format of the output is
0.0100000000000000000 = (1/4)

The input/output of the SQRT can be interpreted differently to "change" the input range. If the input data is left shifted by 2*N bits, the output data is left shifted by N bits.

An 8 bit input to the Square Root:
0000100.0 = (SQRT 4)
0010.0000 = (2)

A 20 bit input to the Square Root:
0000000000000000010.0 = SQRT(2)
0000000001.0110101000 = 1.4140625

A 20 bit input to the Square Root:
0000000000000000100.0 = SQRT(4)
0000000010.0000000000 = 2

Here is a quick summary of how to instantiate an integer-in, integer-out CORDIC Square Root:

X_IN  : STD_LOGIC_VECTOR(20-1 DOWNTO 0);
X_OUT : STD_LOGIC_VECTOR(20-1 DOWNTO 0);

Instantiate a 21 bit square root:

X_IN_SQRT  : STD_LOGIC_VECTOR(21-1 DOWNTO 0);
X_OUT_SQRT : STD_LOGIC_VECTOR(21-1 DOWNTO 0);

Make the following assignments:

Inputs:
X_IN_SQRT <= X_IN & '0';

Outputs:
X_OUT(20-1 DOWNTO 11) <= (OTHERS=>'0');
X_OUT(10 DOWNTO 0) <= X_OUT_SQRT(21-1 DOWNTO 10);
##end

Thomas Wambera

Article: 47752
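To make the Q1 number-format examples above concrete, here is a small Python behavioral model. This is an illustrative sketch (the function name and structure are mine, not Xilinx's), and it models only the fixed-point interpretation, not the CORDIC iteration itself:

```python
# Behavioral model of the Q1 square-root interpretation described above.
# A Q1 number of a given width has 1 integer/sign bit and (width-1)
# fraction bits; the core maps a Q1 input to a Q1 output through sqrt().
import math

def q1_sqrt(x_in: int, width: int) -> int:
    """Treat x_in as a Q1 fraction and return its square root in Q1."""
    scale = 1 << (width - 1)          # weight of the LSB is 1/scale
    return round(math.sqrt(x_in / scale) * scale)
```

Because shifting the input left by 2*N bits shifts the output left by N bits, the same raw bit patterns also cover the integer examples: the 8-bit pattern 0b00001000 (SQRT 1/16) yields 0b00100000 (1/4), and reinterpreting the binary points gives the SQRT(4) = 2 case.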
The Tcl interpreter uses a bytecode compiler internally to improve performance. That bytecode is not suitable for hardware implementation, however. We designed a special bytecode for this purpose, and yes, the FPGA is like a virtual machine for this bytecode.

--Scott

"Phil Tomson" <ptkwt@shell1.aracnet.com> wrote in message news:anf7en0ct1@enews3.newsguy.com...
> In article <uplqnffa9droca@corp.supernews.com>,
> Scott Thibault <thibault@gmvhdl.com> wrote:
> >AcroDesign Technologies has announced results from its work on an
> >embedded processor for the Tcl language. More information, and a recent
> >presentation, is available at:
> >http://www.gmvhdl.com/acrodesign/research.html#tob
> >
> >--Scott Thibault
> >AcroDesign Technologies
>
> Hmmmm.... so this is basically an FPGA implementation of the Tcl virtual
> machine? (Actually, I wasn't aware that Tcl had a bytecode interpreter,
> but I guess it isn't surprising.)
>
> Phil

Article: 47753
"Nicholas C. Weaver" <nweaver@ribbit.CS.Berkeley.EDU> wrote in message news:anf99u$2h35$1@agate.berkeley.edu...
> In article <anf7en0ct1@enews3.newsguy.com>,
> Phil Tomson <ptkwt@shell1.aracnet.com> wrote:
> >Hmmmm.... so this is basically an FPGA implementation of the Tcl virtual
> >machine? (Actually, I wasn't aware that Tcl had a bytecode interpreter,
> >but I guess it isn't surprising.)
>
> Well, TCL is more an ASCII string munging hack, based around textual
> replacement, so you could treat the program as a (really ugly) bytecode.

Tcl was originally implemented as a string processing engine, but it can be described in more traditional terms just like other languages (i.e., a BNF grammar, compilers, etc.).

> Why you would WANT to, however, is beyond me. Compile Scheme or
> something to a nice vanilla uP core and have hardware support for
> garbage collection.

Tcl is not so different from Scheme if you replace []'s with ()'s, but there are a couple of advantages to using Tcl. First, memory is managed with reference counting, which is simple and predictable. Second, Tcl has extensive built-in string processing abilities that are very useful for embedded devices that communicate using command-based protocols over TCP. There are other advantages, such as being easy to learn, pointer safe, etc.

We compiled to a custom processor because compiling very high-level languages to a vanilla processor can generate large executables, and our target was a memory-constrained device.

--Scott

Article: 47754
Hal Murray wrote:
> [snip description of magic-box gizmo]
>
> Sounds like a fun project.
>
> I'd suggest that you look at your current collection of circuits your
> users are using and try implementing them with various proposed IO
> connections/mappings.

I have to get the thing installed ASAP, because a few people are clamoring to fix crosstalk and to get more n-input gates. So I want to avoid trying to route all the things I might want to try, especially considering that some of the other functions involve one-shots. Some of those might work out OK in the CPLD with digital delays, but they will require some thought to convert from the mixed-signal to the strictly digital domain.

Thus, my plan has been to use a CPLD that is far in excess of the typical logical complexity that is needed, and as long as the thing lets me route in a mostly unconstrained manner, I anticipate it being very rare that anything would be needed that exceeds the capability of the CPLD. Note the context here: most of the circuits currently implemented are just n-input AND and OR gates, with a few cases of combined boolean product and sum functions. Thus, this thing is highly overkill.

But the point is that it is 2002, so what should I put in there? To put in a CPLD vs. discrete logic packages doesn't change the cost equation at all, since for something like this the labor far exceeds the hardware cost. Thus, the best thing to do is to implement the most flexible thing that is within reason. An FPGA is overkill, but a mid-range CPLD seems just right. The CPLD seems better than an FPGA anyway, because this is a combinatorial-heavy, as opposed to register-heavy, application. If you can call a handful of ANDs and ORs heavy at all!

> How are you going to program the CPLD? Perhaps another small
> connector on your daughter card so you can plug in a PC/laptop.

Indeed, I will put a female DB9 on the front panel, to make it easy to plug in a programmer at any time and change the config.
> Are you expecting to do a lot of programming on the fly, or mostly
> use a handful of normal/popular boxes? How are you going to test
> things?

Infrequent reprogramming. The boxes will be delivered with a "default" configuration of a basic assortment of AND, OR, NAND, NOR, looking similar to the existing magic boxen. Later I can modify the programming to suit the specializations that are in place in the various labs, as well as start implementing some of the "more complicated" functions that I would otherwise have implemented in a custom piece of hard-wired hardware.

Some of those special functions will ultimately become dedicated custom hardware pieces anyway in the future. But in the short term, if I can do it on the nice-looking CPLD panel, it will be much better than what I do now, which is to install a breadboard with a debugged prototype circuit, sitting on a table where people can bump into it and accidentally yank the wires out. It will typically sit there for about a year before I have a chance to come back and convert it into a PCB in a permanent chassis. Thus, I hope to avoid the crudeness factor of the temporary prototype hardware that I have sitting around in various labs waiting to be made permanent.

Of course, the CPLD doesn't help me much for analog problems, but usually I have very similar overall circuits: a few analog inputs get conditioned, fed to comparators, and logic-ed with some other digital signals, then spit out some digital results. So perhaps the next step will be to build some general purpose analog building blocks.

> Are you going to program them all or will you set up a system so your
> users can program their own boxes?

Both. They will have a front panel JTAG port as I mentioned. I may construct a little computer cart to wheel around and program the things when needed. Or I may just install the Xilinx software on one of the small army of PCs that typically populate the labs, and do the programming from there.
Whether the users will learn to program them, I am not sure. Unlikely, I suspect. That's why they hired me, and they are all mechanical engineers anyway, though exceptionally bright ones.

> Are you going to leave enough room on the front to attach a drawing
> of the circuit diagram? Maybe the drawing should slip over the BNC
> connectors so it's really obvious which gate is connected to which
> connector.

Ah, this is a good question. I am leaning toward a "Battleship" appearance right now, with silk-screened row and column labels on the panel. When the programming is done, the schematic can be printed, and that will serve as a map. Though it would be nice to have something more specific on the panel. I had envisioned slipping a chart over the BNCs as you mentioned. Not sure about this.

> Are the in/out LEDs really necessary if you have a good drawing?

The LEDs aren't absolutely necessary, but they are not that much trouble to include. They will benefit those situations where one might wonder if a signal is present or not, without having to connect a scope. Most of our signals are slow. We are controlling big diesel engines. But there are some time-critical laser and camera sync pulses going around. Not high frequency, just relative timing.

> How many are you going to make? Would it be simpler/cheaper to dump
> the LEDs, switches, and unused IO gear and always put the output
> connectors on the top (or someplace distinctive)?

I had first considered making the allocation of inputs and outputs fixed. But my consideration of the cost of putting in the added flexibility led to the conclusion that it was worth it to have the selectable IOs, and the LEDs. I will make between 3 and perhaps 10 of these things. The added hardware might increase the total assembly time of each one by 10-20%. This is reasonable.

> If you have more IO pins than connectors, can you parallel several of
> them and get rid of your 74ACTQ14 output buffer chips? (Might be
> ugly to program.)
I like having the buffer chips in between the CPLD and the user. That way it is likely that if they break something, it will only be an individual channel. Also, I want to be sure that if they connect, say, 16 of the outputs to actual 50 ohm loads, which my design is capable of handling, it won't break. Thus, I paid careful attention to things like the maximum allowable DC current per IC package, etc.

If the CPLD drives the outputs directly, I suspect it won't be happy with having many 100 ohm loads attached (the 50 ohm cable terminations, in series with the 50 ohm back terminations). As it stands, I can drive unterminated as well as terminated 50 ohm cables, as many as 32 terminated cables if I really want to, for a total output of 1.6 A without any risk to the CPLD. And there will be little ringing of edges with or without terminations, due to the back terminations.

The nice thing about ACTQ or similar AC drivers is that by paralleling several of them, you lower the non-linearity of the output impedance, so that you can concentrate most of the output impedance in the purely resistive series resistor. And the ACTQs can handle a direct short circuit on the output, with the series resistor, indefinitely. Plus there is enough current capacity left over to drive the LED associated with each BNC.

Oh, one other important thing! The CPLD runs at 3.3V, so I need level shifting anyway.

Thanks for your interest.

Good day!

--
____________________________________
Christopher R. Carlen
Principal Laser/Optical Technologist
Sandia National Laboratories CA USA
crcarle@sandia.gov

Article: 47755
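The drive-current figures in the post above can be checked with simple Ohm's-law arithmetic (component values taken from the post; this is a back-of-envelope sketch, assuming a full rail-to-rail 5 V swing, not a circuit simulation):

```python
# Sanity check of the output drive numbers: a 5 V driver through a 50 ohm
# back-termination resistor into a 50 ohm terminated cable sources 50 mA,
# so 32 fully terminated outputs total 1.6 A, matching the post.
V_DRIVE = 5.0    # assumed driver swing, volts
R_BACK = 50.0    # series back-termination, ohms
R_LOAD = 50.0    # cable termination, ohms

i_per_output = V_DRIVE / (R_BACK + R_LOAD)          # amps per terminated output
i_total = 32 * i_per_output                         # all 32 outputs terminated
v_at_load = V_DRIVE * R_LOAD / (R_BACK + R_LOAD)    # divider gives 2.5 V at a matched load
```

Note the matched load sees half the swing; at an unterminated far end the reflection doubles the step back up to the full 5 V, which is the usual back-termination trade-off.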
In article <3D9C4B49.9832D9B2@wambera.de>, Thomas Wambera <thomas@wambera.de> writes:
>I wrote to the Xilinx Support, they gave me the following example,
>possibly it might help in the future:
<soln snipped>

Thanks for the feedback. I looked at your problem but could not see a clear solution; your answer is now bookmarked :)

--
fred

Article: 47756
check the xilinx CPLD app note xapp 346 for some starters. http://www.xilinx.com/apps/epld.htm Steve skillwood wrote: > Hi all, > Can some one give me an introduction to low power SoC design . What is > difference from an ordinary design and low power design in the design stage > . Suppose I am designing a fsm based sequential logic , at which stage the > "LOW POWER " Comes in . > > thanks > skillieArticle: 47757
Hi,

I think this is really a case of where to draw the line between SW and HW implementations. There are certain very mundane things that C code running on a general purpose processor is much better at... sure, you can do it in hardware, but you're burning gates very quickly. OTOH, if a SW implementation isn't cutting it in terms of performance, you can convert specific areas of a TCP/IP stack to hardware. Two such examples are any data shoveling present in your stack that copies data as the packet is being assembled - use DMA instead. Another is the TCP checksum, something that will take a few thousand cycles on a normal RISC CPU (for a large IP packet)... this can be taken down to *tens* of cycles by converting a few lines of C code to a few lines of HDL.

I guess what I'm getting at is that C -> all gates in the system doesn't make much sense. Select areas of C that are slow -> gates does make sense.

This is straying from topic, but looking "down" a couple layers, there are also Ethernet MAC cores (such as the opencores ethmac project) which include their own DMA engine that masters memory. The host CPU, or logic, or whatever you're interfacing to assembles a packet and passes its location and a 'go' bit to the MAC core... it does the rest. It takes a bit of effort, but it's not very difficult to wire up such a MAC to a processor core such as Nios (myself and a few other people who have posted to the opencores site have done so).

- Jesse

> FPGA TCP
> On the fpga-cpu list, Anand Gopal Shirahatti asks:
> "... What I was wondering is, are there implementations of the TCP/IP
> stack over a single FPGA, for multiple connections? ..."
> The simplest thing to do is run a software TCP/IP stack on a soft CPU
> core. For example, at ESC I saw TCP/IP running on uCLinux on Altera Nios
> with a CS8900A ethernet MAC.
>
> Note that a compact FPGA CPU core with integral DMA (e.g. xr16) may be
> hybridized into the data shovel aspect of an ethernet MAC.
> (Flexibly shovel the incoming bits to/from buffers, etc.) Indeed, one
> enhanced FPGA CPU might (time multiplexed or otherwise) manage several
> physical links.
>
> You can also build hardware implementations of the TCP/IP protocol
> itself. There are several such implementations in custom VLSI. For FPGA
> approaches, see:
>
> * Smith et al's XCoNet.
>
> * BlueArc SiliconServer white paper.
> "The SiliconServer runs all normal TCP/IP functionality in state
> machine logic with a few exceptions that are currently dealt with by
> software running on the system's attached processor (e.g. ICMP traffic,
> fragmented traffic reassembly)."
>
> And related things: FPX, KSM.

Article: 47758
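The "few lines of C" checksum mentioned in this thread is the Internet checksum of RFC 1071: a one's-complement sum of 16-bit words. Here is a Python sketch of the software loop that a dedicated datapath would replace (the function name is mine):

```python
# Internet checksum (RFC 1071), the per-packet computation discussed above.
# In software this loop touches every byte; in an FPGA the same fold can
# run as wide parallel adds alongside the DMA data movement.
def internet_checksum(data: bytes) -> int:
    if len(data) % 2:
        data += b"\x00"                           # pad odd-length data
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]     # big-endian 16-bit word
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry fold
    return ~total & 0xFFFF
```

A receiver verifies a packet by checksumming the data with the transmitted checksum included; the result is zero when the packet is intact.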
Chris,

If you had asked this question about a year or two ago I would say you should seriously consider using a Spartan (5 volt compatibility). Given the low cost of decent FPGAs, I personally would still use an FPGA. Use a reprogrammable serial PROM to hold the code. Given the capacity of MEs to short or otherwise mis-connect outputs and inputs, I would consider something in a socketable package for the FPGA (or, if you decide differently, the CPLD). The reason I suggest the FPGA is that, from personal experience, the project can quickly grow all out of proportion.

Here is an idea that would perhaps really simplify the job for you... What about using one of the demo boards from one of the distributors to do the job for you? Typically they have at least one clock input with an on-board oscillator (in case you need a digital one-shot or two); total cost would be about $200 at most. One source that has impressed me is Insight Electronics. I have seen their boards and they are pretty good quality. http://www.insight-electronics.com - and click on the Xilinx development kit window. I see they have a kit for the CoolRunner XPLA3 for $125. I don't work for them or anything; it just seems that it is a waste of time to re-invent the wheel.

Good luck,
Theron

"Christopher R. Carlen" <crcarle@sandia.gov> wrote in message news:3D9B8FDE.8000405@sandia.gov...
> Hi folks:
>
> (Please skim down to "Question:" if you don't want to read the details...)
>
> In our engine labs we have a "magic box" which is a chassis with a panel
> covered with BNCs, connected to the inputs and outputs of a variety of
> basic logic gates (AND, OR, etc). This magic box is used to implement
> various glue logic functions for our research engine control,
> experimental apparatus, and data acquisition control schemes.
>
> There are two problems with the existing magic boxes: 1.
> They have terrible crosstalk problems, since they were done with wire
> wrapping and not much thought to the existence of such things as mutual
> inductance and capacitance. 2. The scientists tend to use up a lot of
> one type of gate, leaving the others unused. Then they come to me and
> say "I ran out of AND gates" or "I need a 6-input OR gate, can you
> modify the magic box?" They also come to me periodically asking for me
> to implement various logical gizmos of somewhat greater complexity than
> the magic box can handle, requiring the design of custom hardware.
>
> Rather than get my name associated with the poorly functioning device
> after performing mods, and rather than waste my time futzing with the
> wiring of the thing to add more gates only to have to make another
> hardware change in a few more months, I decided to get modern:
>
> I plan to build a universal magic box with a Xilinx CoolRunner XPLA3
> XCR3128XL-VQ100 CPLD device. This seemed like a good way to start
> learning the ropes with PLDs, which I've been eager to do for some time.
> My box will have nice ESD protection networks on each of an array of
> 32 BNC connectors. Each connector can be changed from a Schmitt trigger
> input buffer to a 50R back-terminated 3x-paralleled 74ACTQ14 output
> buffer, by switching a little DPDT switch (on the inside of the
> chassis). A bi-color LED will go with each connector, and will glow red
> for outputs active, and green for inputs active.
>
> Most of the space on the new magic box PCB, which will fit directly into
> the panel so I don't have to run any wires from the connectors to
> anywhere, is consumed by the IO buffering, switches, and LEDs. So I
> plan to fit the CPLD on a little daughterboard that will plug into the
> main board, kind of like a giant DIP package. The CPLD daughterboard
> will have available 40 IOs, the 4 global clock inputs, the JTAG signals,
> and will have an on-board 3.3V regulator.
>
> Everything is pretty well thought out so far, I think. The only problem
> is, the CPLD has 84 IOs available, of which I plan to use up to 40. 32
> IOs will be connected to the user BNCs for certain, and I will have a
> little header on the main PCB for expansion to another 8 user IOs if
> needed in the future, as well as a header for access to the global
> clocks, which will also be jumper-selectable to connect to four of the
> BNCs, if desired.
>
> Enough bells and whistles? ;-) I hope the CPLD will allow me to
> reconfigure the logic available to the user on the fly, and even to
> implement those "more complicated than just a few gates" functions that
> get asked for now and then, without having to build a new physical
> circuit prototype breadboard and PCB.
>
> Question:
>
> How should I map the user IOs to the CPLD IOs, i.e. function blocks and
> macrocells, so as to result in the greatest likelihood of always being
> able to route whatever functions I want to the pins I choose?
>
> There seem to be two possible approaches: 1. Take a few IOs from each
> function block, so that all function blocks are ultimately represented
> to the outside (example: take 5 random macrocells from each of the 8
> function blocks, for 40 IOs). Or 2. Use all the macrocells of the
> first few function blocks until my 40 IOs are mapped out, then leave the
> rest of the function blocks available for internal-only routing
> (example: take all 16 macrocells from the first 2 function blocks, plus
> 8 macrocells from the 3rd function block, leaving 5.5 function blocks
> not connected to the outside).
>
> Any suggestions?
>
> Is this a weird problem?
>
> Thanks for comments!
>
> Good day.
>
> --
> ____________________________________
> Christopher R. Carlen
> Principal Laser/Optical Technologist
> Sandia National Laboratories CA USA
> crcarle@sandia.gov

Article: 47759
Hi,

I am going to implement a small MAC (multiply-accumulate) unit on an FPGA, but I can't find any detailed information on its architecture. All the architectures in the papers are very complicated. I need an easy, small implementation. Does anyone know any materials describing it?

Thank you very much!

sincerely
-------------
Kuan Zhou
ECSE department

Article: 47760
In article <Pine.SOL.3.96.1021003130537.25536A-100000@vcmr-86.server.rpi.edu>, Kuan Zhou <zhouk@rpi.edu> wrote: >Hi, > I am going to implement a small MAC (multiply-accumulation) unit on >FPGA.But I can't find any detailed information on its architecture.All the >architechtures on the papers are very complicated.I need an easy,small >implementation.Does anyone know any materials describing it? Be more specific: What FPGA? What design criteria? Functionality? Low latency? High performance with pipelining? What class is this homework for? -- Nicholas C. Weaver nweaver@cs.berkeley.eduArticle: 47761
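Since the question asks for the simplest possible architecture, a minimal behavioral model may help: each clock the unit computes acc <= acc + a*b, with guard bits on the accumulator so repeated sums don't overflow the product width. This Python sketch is illustrative only (the class name and default widths are my own, not from any particular paper or FPGA family):

```python
# Minimal multiply-accumulate (MAC) behavioral model: one multiplier
# feeding an accumulator wider than the product, so many products can
# be summed before overflow. Widths here are example values.
class Mac:
    def __init__(self, in_width: int = 8, guard_bits: int = 8):
        self.acc_width = 2 * in_width + guard_bits  # product width + headroom
        self.acc = 0

    def clock(self, a: int, b: int) -> int:
        """One cycle: accumulate a*b, wrapping at the accumulator width."""
        self.acc = (self.acc + a * b) & ((1 << self.acc_width) - 1)
        return self.acc

    def clear(self) -> None:
        self.acc = 0
```

In hardware the multiply stage is the expensive part (a partial-product array or a dedicated block multiplier), while the accumulate stage is just an adder with a registered feedback path; pipelining that multiplier is what most of the complicated papers are really about.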
This is a bit far-fetched, but it might work very nicely for your application. Have you thought of using DVI I/O chips?

DVI is a relatively recent connectivity methodology for computer displays. It is, in essence, serialized 8 bit RGB. A single link can deliver on the order of 5 or 6 gigabits per second, if I recall. The chips (both TX and RX) are less than ten bucks a piece. You can certainly clock DVI at less than the max single-link 165 MHz rate and transport your data via a serial link. I think the chips will go down to 25 MHz clocking. At low data rates you can probably go many more feet than the standard provides for. Heck, you could have three redundant links delivered over a commodity cable.

Anyhow, just a thought. Check out the Silicon Image site for more details: http://www.siimage.com/home.asp

HTH,

--
Martin Euredjian

To send private email:
0_0_0_0_@pacbell.net
where "0_0_0_0_" = "martineu"

"Theron Hicks" <hicksthe@egr.msu.edu> wrote in message news:anctr2$2brl$1@msunews.cl.msu.edu...
> Hello,
> I am developing an instrument that is currently communicating over a
> special high speed parallel board. The data rate is 6.4 million 8 bit
> words per second. The board works great, but it costs in excess of
> $1600 US per copy. It also occupies a full-sized PCI slot. We are
> considering implementing an alternative I/O arrangement such as USB2 or
> ethernet (TCP/IP). Is anyone aware of free-ware USB2 implemented in
> VHDL or some other FPGA-friendly technology? Note: target FPGA is a
> Spartan2E (or if absolutely necessary, Virtex2).
>
> Thanks,
> Theron

Article: 47762
"Christopher R. Carlen" <crcarle@sandia.gov> wrote in message news:3D9C5E6E.5060500@sandia.gov... <snip> > Oh, one other important thing! The CPLD runs at 3.3V, so I need level > shifting anyway. > > Thanks for your interest. > > Good day! > If you're using buffers you might want to consider using 74LVC4245A level shifting buffers from Philips, that way you can drive from the PLD 3.3V logic and be able to select 3.3V or 5V CMOS levels on your outputs. http://www.philipslogic.com/products/lvc/pdf/74lvc4245a.pdf Mark.Article: 47763
Another point on the curve is to add hardware functional units to your soft processor core to do the expensive inner-loop computations (e.g. the packet checksum example) in a dedicated datapath, either explicitly, or as a side effect of loading/storing the data.

See also a recent article: Jesse Kempa, Altera, at ChipCenter: Maximizing Embedded System Performance in the Era of Programmable Logic (http://www.chipcenter.com/pld/images/pldf097.pdf). A very nice article, based upon the task of speeding up a Nios SoC-based HTTP server, illustrating that creative application of programmable logic can deliver big speedups over a pure software approach.

Also: "Jesse Kempa" <kempaj@yahoo.com> wrote
> ... This is straying from topic, but looking "down" a couple layers,
> there are also Ethernet MAC cores (such as the opencores ethmac project)
> which include their own DMA engine that masters memory. The host CPU,
> or logic, or whatever you're interfacing to assembles a packet and
> passes its location and a 'go' bit to the MAC core...

As I originally wrote (http://www.fpgacpu.org/log/apr02.html#020405):
> > Note that a compact FPGA CPU core with integral DMA (e.g. xr16) may be
> > hybridized into the data shovel aspect of an ethernet MAC. (Flexibly
> > shovel the incoming bits to/from buffers, etc.) Indeed, one enhanced
> > FPGA CPU might (time multiplexed or otherwise) manage several physical
> > links.

Elaboration: If you think about it, a processor datapath is already superbly outfitted to implement DMA. Fetching sequential instructions is equivalent to DMA, and branches/jumps are equivalent to loading the next DMA transfer address. In the xr16 design, I replaced a single PC register with a 16-entry "PC register file" that makes it easy to run multiple threads or do multiple channels of DMA. Datapath cost (n-bit wide datapath): +n LUTs, -n FFs.
This idea can save you any number of DMA address counters, address muxes, and the address mux delays, elsewhere in the design. In the XSOC/xr16 Kit, I used one hardwired channel of DMA to stream in video data from external RAM. (Necessity (shoehorning the processor, video controller, and rest of the SoC into a '4005) being the mother of invention.)

The next step (not taken in xr16) is to add instructions to programmatically schedule these DMA operations -- addresses, counts, arbitration.

Thus it seems to me that an enhanced { 200 LUT, 1 BRAM } 16-bit RISC soft core could make a pretty capable MAC, and as a bonus you can build a (software) TCP/IP engine for zero additional LUTs.

Jan Gray, Gray Research LLC

Article: 47764
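The PC-register-file trick described above can be sketched in a few lines. This is a toy Python model under my own naming, not the actual xr16 netlist: one shared incrementer serves every channel because each channel's "current address" lives in a small register file.

```python
# Toy model of a "PC register file": one incrementer time-multiplexed over
# N channels, replacing N dedicated DMA address counters and their muxes.
# The 16-bit wrap matches a 16-bit datapath like xr16; the channel count
# of 16 mirrors the 16-entry register file mentioned in the post.
class PcRegFile:
    def __init__(self, n_channels: int = 16):
        self.pc = [0] * n_channels      # one resume address per channel/thread

    def step(self, ch: int, stride: int = 1) -> int:
        """Issue the current address for channel ch, then advance it."""
        addr = self.pc[ch]
        self.pc[ch] = (addr + stride) & 0xFFFF
        return addr
```

Interleaving calls to step() on different channels is the software analogue of the time-multiplexed datapath: instruction fetch, video DMA, and packet DMA each just pick a different entry on their turn.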
In response to a posting by "Jesse Kempa" <kempaj@yahoo.com>, I wrote > See also a recent article: > Jesse Kempa, Altera, at ChipCenter: Maximizing Embedded System Performance > in the Era of Programmable Logic, > (http://www.chipcenter.com/pld/images/pldf097.pdf). ... Oops, sorry, I knew that name seemed familiar. Nice article though. Jan Gray, Gray Research LLCArticle: 47765
I am now emulating an ISA bus model and my core design on a FLEX 10K, so the PC can communicate with my core design for a large testbench. The FLEX 10K acts as an I/O card device. It is assigned an IRQ and a segment of I/O port addresses. An ISA I/O device must be initialized at power-on of the motherboard, and the OS will load my device driver when booting.

The problem is that I use a ByteBlaster (LPT) to download the FPGA programming data. So I must boot twice: once to program the FPGA as an ISA I/O device (containing the ISA bus model and my core design), and again to initialize the ISA I/O device and load my device driver. Can I program the FPGA without the PC, so that the ISA I/O device is ready first, and then power on the PC? Which programming method should I select?

Article: 47766
Ken Smith wrote:
> In article <3D9B8FDE.8000405@sandia.gov>,
> Christopher R. Carlen <crcarle@sandia.gov> wrote:
> [....]
> >Everything is pretty well thought out so far, I think. The only problem
> >is, the CPLD has 84 IOs available, of which I plan to use up to 40. 32
>
> I suggest you spread the I/O connections among the logic blocks in
> logical groups.
>
> If you are fairly certain the other lines will not be needed, you can
> hook pairs of them together so that you have another way to get signals
> between logic blocks, or you can wire up a socket for a 22V10 or spare
> I/O. This will help keep your options open.

Perhaps it will suffice to bring the unused IOs out to vias. Then I can tie them together later if need be. It may even be less trouble to build a new daughterboard for the XCR3256 chip later on. But the XCR3128 seemed like it would give a lot of room for growth over the XCR3064 which I originally considered. The main problem there was having to share the JTAG pins, which I wanted to avoid for simplicity.

--
____________________________________
Christopher R. Carlen
Principal Laser/Optical Technologist
Sandia National Laboratories CA USA
crcarle@sandia.gov

Article: 47767
Theron Hicks wrote: > Chris, > > If you had asked this question about a year or two ago I would say you > should seriously consider using a Spartan (5 volt compatibility) Given the > low cost of decent FPGAs I personally would still use an FPGA. Use a > reprogrammable serial prom to hold the code. Given the capacity of MEs to > short or otherwise mis-connect outputs and inputs, I would condsider > something in a socketable package for the FPGA (or, if you decide > differently, the CPLD.) The reason I suggest the FPGA is that from personal > experience, the project can quickly grow all out of proportion. Thanks for your input. Egad, an FPGA just isn't necessary. Remember, the existing functionality utilizes about 5-10 2-input gates. And the Coolrunner XPLA3 has 5V tolerant IO. > > Here is an idea that would perhaps really simplify the job for you... What > about one of the demo boards from one of the distributors to do the job for > you. Typically they have at least one clock input with an on-board > oscillator(in case you need a digital one-shot or two) total cost would be > about $200 at most. One source that I have seen that has impressed me is > Insight Electronics. I have seen their boards and they are pretty good > quality. Yes I am aware of that. Unfortunately, I tend to like to make my own boards, because I am very fussy about having control over every little aspect of the circuitry. In fact I just completed a CPLD dev. board that took several months of after work hours at home to design. But it is *so* flexible and just the way I like it that I have no regrets about spending the time rather than buying one off the shelf. Good day! -- ____________________________________ Christopher R. Carlen Principal Laser/Optical Technologist Sandia National Laboratories CA USA crcarle@sandia.govArticle: 47768
markp wrote:
> "Christopher R. Carlen" <crcarle@sandia.gov> wrote in message
> news:3D9C5E6E.5060500@sandia.gov...
> <snip>
>
>>Oh, one other important thing! The CPLD runs at 3.3V, so I need level
>>shifting anyway.
>>
>>Thanks for your interest.
>>
>>Good day!
>
> If you're using buffers you might want to consider using 74LVC4245A level
> shifting buffers from Philips; that way you can drive from the PLD's 3.3V
> logic and be able to select 3.3V or 5V CMOS levels on your outputs.
>
> http://www.philipslogic.com/products/lvc/pdf/74lvc4245a.pdf
>
> Mark.

Hmm, in this case there is zero likelihood of having to output 3.3V levels.

What is the likelihood that commercial instruments in the next few years will have user inputs that are 3.3V level instead of 5V TTL compatible levels?

There are so many logic levels these days, it makes sense to keep the external world interface standardized on one thing, so all instruments can talk to each other. 5V works for me. I hope it stays that way.

Thanks for your input!

--
____________________________________
Christopher R. Carlen
Principal Laser/Optical Technologist
Sandia National Laboratories CA USA
crcarle@sandia.gov

Article: 47769
Ru-Chin Tsai wrote:
>
> I now emulate an ISA bus model and my core design on the FLEX 10K, so the PC
> can communicate with my core design for a large testbench. The FLEX 10K now
> acts as an I/O card device: it is assigned an IRQ and a segment of I/O port
> address space. An ISA I/O device must be initialized at power-on of the
> motherboard, and the OS will load my device driver when booting. The problem
> is that I use a ByteBlaster (LPT) to download the FPGA programming data, so
> I must boot twice: once to program the FPGA as an ISA I/O device (containing
> the ISA bus model and my core design), and again to initialize the ISA I/O
> device and load my device driver. Can I program the FPGA without the PC, so
> the ISA I/O device is ready first, and then power on the PC? Which
> programming method should I select?

Your request is not completely clear to me, but I think you are looking for a way to automatically load the FPGA on boot up. If your design is a little more stable you can wire a serial EEPROM onto the board and the FPGA will load directly from that. There are app notes on this at the Altera web site. Atmel makes some nice reprogrammable parts for this. I believe one or the other site even has plans on how to connect a serial memory along with a cable to allow you to reprogram the EEPROM or the FPGA, your choice, IIRC.

--
Rick "rickman" Collins
rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design
URL http://www.arius.com
4 King Ave
Frederick, MD 21701-3110
301-682-7772 Voice
301-682-7666 FAX

Article: 47770
Christopher R. Carlen wrote:
>
> Oh, one other important thing! The CPLD runs at 3.3V, so I need level
> shifting anyway.

Given the style of this project, you should perhaps look at a 5V PLCC device (e.g. Atmel ATF1508ASL). That way users can replace it if it gets damaged, or you could even use a ZIF socket so they can have their own chips. (Note: JTAG reprogramming has a finite cycle count, so many changes will 'wear out' the chip - and they might prefer to 'pick a chip' over 'find a file'...)

ATF1508ASL quotes > 10,000 (re)pgm cycles,
XC2 quotes 1,000 (re)pgm cycles,
MAX3000A quotes > 100 (re)pgm cycles.

If the functions are simply 'mostly AND, some OR', then almost any PLD will handle that - your only ceiling is fan-in, which is the limit on logic functions per block, so you are limited to an XX-input AND gate (XX varies with brand: 36/40/...). With spare I/O pins you can also merge logic, so the physical limit will be all inputs.

- jg

Article: 47771
Scott Thibault wrote:
>
> The Tcl interpreter uses a bytecode compiler internally to improve performance.
> That bytecode is not suitable for hardware implementation, however. We
> designed a special bytecode for this purpose, and yes, the FPGA is like a
> virtual machine for this bytecode.
>
> --Scott

Sounds interesting - can you post a small example of the flow? Something like tiny source code / intermediate sizes / final speed, and the size of the Tcl engine itself? Not everyone here will know Tcl in detail, but the general application of script handling within an FPGA is useful to get a handle on.

- jg

Article: 47772
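For readers unfamiliar with the approach: a bytecode virtual machine is essentially a fetch/decode/dispatch loop over a compact instruction encoding, which is why it maps naturally onto an FPGA state machine. The sketch below is a minimal software model of such a loop; the instruction set here is invented for illustration and has no relation to Thibault's actual hardware bytecode.

```python
# Toy stack-machine bytecode interpreter: each instruction is an
# (opcode, argument) pair. The opcodes are hypothetical.
def run(bytecode, stack=None):
    """Execute the bytecode and return the final operand stack."""
    stack = stack if stack is not None else []
    pc = 0
    while pc < len(bytecode):
        op, arg = bytecode[pc]
        if op == "PUSH":            # push an immediate value
            stack.append(arg)
        elif op == "ADD":           # pop two operands, push their sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "JZ":            # branch to arg if top of stack is zero
            if stack.pop() == 0:
                pc = arg
                continue
        elif op == "HALT":
            break
        pc += 1
    return stack

# The expression 2 + 3 compiled to the toy bytecode:
program = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("HALT", None)]
```

In hardware, each iteration of this `while` loop would correspond to one or more clock cycles of a fetch/decode/execute state machine reading the bytecode out of block RAM.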
In article <upolclncc0qr16@corp.supernews.com>, Scott Thibault <thibault@gmvhdl.com> wrote:
>Tcl is not so different from Scheme if you replace []'s with ()'s, but ...

Actually, it's a big difference due to scoping rules. Tcl's scope semantics (dynamic scope) is a BUG, but a bug which arises from its original incarnation as string munging, which meant that it couldn't have proper closures. This is a big issue, as the Tcl/Tk windowing model is rightly patterned around the notion of binding closures to events. Yet binding closures to events is much more predictable and useful with lexical scope, a total nightmare with dynamic scope.

>there are a couple of advantages to using Tcl. First, memory is managed with
>reference counting, which is simple and predictable.

Only because Tcl doesn't allow real structures & references. Reference counting can't collect cyclic structures.
--
Nicholas C. Weaver nweaver@cs.berkeley.edu

Article: 47773
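Weaver's point about cycles is easy to demonstrate: two objects that reference each other keep each other's reference count above zero forever, so plain reference counting never frees them. A minimal Python illustration (Python happens to layer a separate cycle detector on top of reference counting, which is what rescues this case):

```python
import gc

class Node:
    def __init__(self):
        self.ref = None

a, b = Node(), Node()
a.ref, b.ref = b, a    # a and b now reference each other

# Each object is kept alive by the other's reference, so a pure
# reference-counting collector never sees either count reach zero,
# even after the program drops its own handles:
del a, b

# CPython reclaims the pair only because its gc module walks the
# object graph looking for unreachable cycles:
collected = gc.collect()
```

Tcl's refcounted values avoid this problem only because, as Weaver notes, its values cannot form such reference cycles in the first place.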
"Christopher R. Carlen" <crcarle@sandia.gov> wrote in message news:3D9C9447.70908@sandia.gov...
> markp wrote:
> > "Christopher R. Carlen" <crcarle@sandia.gov> wrote in message
> > news:3D9C5E6E.5060500@sandia.gov...
> > <snip>
> >
> >>Oh, one other important thing! The CPLD runs at 3.3V, so I need level
> >>shifting anyway.
> >>
> >>Thanks for your interest.
> >>
> >>Good day!
> >
> > If you're using buffers you might want to consider using 74LVC4245A level
> > shifting buffers from Philips, that way you can drive from the PLD 3.3V
> > logic and be able to select 3.3V or 5V CMOS levels on your outputs.
> >
> > http://www.philipslogic.com/products/lvc/pdf/74lvc4245a.pdf
> >
> > Mark.
>
> Hmm, in this case there is zero likelyhood of having to output 3.3V
> levels.
>
> What is the likelyhood that commercial instruments in the next few years
> will have user inputs that are 3.3V level instead of 5V TTL compatible
> levels?

Pretty low I guess, they'd probably make them backwards compatible with bomb proof inputs anyway.

> There are so many logic levels these days, it makes sense to keep the
> external world interface standardized on one thing, so all instruments
> can talk to each other. 5V works for me. I hope it stays that way.

OK, the reason I mentioned it is because it is easy to go from LVTTL to 5V CMOS, and easy to go from LVTTL to 3.3V CMOS, but it's quite difficult to have LVTTL inputs and to be able to select 3.3V or 5V CMOS outputs without fitting different chips. Interest value only really.

Mark.

Article: 47774
[Disclaimer: I'm employed by ARC, but I'm an engineer, not in marketing...]

In article <ani10j$9pb$1@slb7.atl.mindspring.net>, Jan Gray wrote:
> Another point on the curve is to add hardware functional units to
> your soft processor core to do the expensive inner loop computations
> (e.g. the packet checksum example) in a dedicated datapath, either
> explicitly, or as a side effect of loading/storing the data.

Certainly this is the focus of companies like ARC and Tensilica. Make a baseline GPP or DSP and allow the end user to add new instructions via the control and datapaths of the CPU. The software tools benchmark and profile where you waste your time, and you can then determine what functions make sense to augment, either with custom opcodes, compute engines, etc. It's taking time, but more customers are getting used to the idea. It helps tremendously in area, performance, and low-power applications.

Jan takes it to an interesting tangent by focusing on cool FPGA optimization techniques to embed lots o' small processors on reprogrammable logic. In our ASIC domain a number of our customers embed multiple ARC processors on a single piece of ASIC silicon.

Regarding your comments about DMA engines, I find most of our customers, *especially* on the USB side, are very nervous at first about having the peripheral put the data right where you want it (or pull it right from memory) -- that's been our approach. There's something very simple about just reading and writing to FIFOs. Unfortunately, if an external (to the peripheral) DMA engine reads and writes to peripheral FIFOs, you're using twice the memory bandwidth (read from the FIFO, put it in memory). If the uProcessor does the movement to/from the FIFOs, you're wasting a ton of bus bandwidth between the opfetches and the actual data movement. Tell the peripheral where to put the payload data and let it read/write directly with a protocol-aware DMA engine. Doing that scares a bunch of customers in the embedded domain.
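The "packet checksum example" in Jan Gray's quote refers to the 16-bit ones'-complement checksum used by IP and TCP (RFC 1071). As a software sketch of the inner loop being discussed for hardware offload (the function name is mine):

```python
# RFC 1071 Internet checksum: sum the data as 16-bit big-endian words
# in ones'-complement arithmetic, then return the complement of the sum.
# In software this loop touches every byte of the packet; a dedicated
# datapath can compute it as a side effect of moving the data.
def inet_checksum(data: bytes) -> int:
    if len(data) % 2:                              # pad odd length with a zero byte
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]      # next 16-bit word
        total = (total & 0xFFFF) + (total >> 16)   # fold the carry back in
    return ~total & 0xFFFF
```

A receiver can verify a packet by checksumming the data *including* the transmitted checksum field; a correct packet yields zero.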
> In the xr16 design, I replaced a single PC register with a 16-entry
> "PC register file" that makes it easy to run multiple threads or do
> multiple channels of DMA. Datapath cost (n-bit wide datapath): +n
> LUTs, -n FFs. This idea can save you any number of DMA address
> counter(s), address mux(es), and the address mux delay(s), elsewhere
> in the design.

Interesting about the multiple threads: do you rotate the general-purpose register file of the xr16 (between multiple register files) in sync with the PC thread rotation? How do you keep context between the threads?

We have two sets of x86 HDL cores -- Classic and Turbo. In our Turbo186 we threaded the standard programmable DMA engine tightly into the control of the general uProcessor execution engine to eliminate arbitration. If there are any DMA channels pending operation, the bus cycles are rotated automatically by the execution state machine: DMA read (1 cycle), opcode execution (1 cycle of it), DMA write (1 cycle), DMA read (1 cycle), opcode execution (1 cycle of it), DMA write (1 cycle), [...] This way there is no arbitration and the processor can't be starved by big DMA movement. The DMA engine needed the dead cycle anyway, so at least the processor got to do some useful work in between.

> Thus it seems to me that an enhanced { 200 LUT, 1 BRAM } 16-bit RISC
> soft core could make a pretty capable MAC, and as a bonus you can
> build a (software) TCP/IP engine for zero additional LUTs.
> Jan Gray, Gray Research LLC

We had at least two customers take our small V8 RISC soft core and bolt it to our MAC cores to build things like filters, switches, bridges, and network-aware devices. Yes, they could prototype them in cheap Spartans. It all depends on how advanced you need your TCP/IP processing to be. Our V8 may not be quite as small as the xr16, but it's in HDL and was targeted at ASICs, not the neat tricks you can play in FPGAs at the low level (esp. the embedded RAMs). I've read your papers on the xr16, though.
(A couple of years ago.) Very cool stuff. Have you made progress on the compiler/software side in the past couple of years?

--
Scott Bilik
http://BilikFamily.com/
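The Turbo186-style fixed bus rotation Bilik describes can be modeled in a few lines (all names here are invented for illustration): when any DMA channel is pending, bus ownership cycles DMA-read / execute / DMA-write, giving DMA guaranteed bandwidth without any arbitration logic and without ever starving the processor.

```python
# Toy model of a fixed-rotation bus scheduler: yields the owner of
# each successive bus cycle. With DMA pending, the pattern is the
# hard-wired 3-cycle rotation; otherwise the CPU owns every cycle.
def bus_schedule(dma_pending: bool, n_cycles: int):
    pattern = (["dma_read", "execute", "dma_write"]
               if dma_pending else ["execute"])
    for i in range(n_cycles):
        yield pattern[i % len(pattern)]
```

Because the rotation is fixed, the worst-case latency of both the DMA engine and the processor is known at design time, which is exactly why no arbiter is needed.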