Steven, Good post. Thanks for the Triscend info. I thought I'd just point out a few omissions about the XA10 device since I have a bit of knowledge about the part... First of all, the device is currently shipping to customers. Demand is pretty high right now, so they're a bit hard to come by, but they are shipping today. Second, the XA10 is the largest device in the Excalibur ARM family. Compared to the Triscend devices, the Excalibur XA10 is much more "logic-heavy", having around 38,000 FFs/LUTs compared to less than 4,000 in the largest Triscend ARM device (I'm assuming that a CSL is something like a FF/LUT). There are two other Excalibur ARM devices coming soon, one with about 16,000 FFs/LUTs (XA4) and another with somewhere around 4,000 (XA1). These will probably be much closer in price to the Triscend parts than the XA10 is. The last thing to point out is that the Excalibur ARM devices are of the ARM 9 family (rather than the ARM 7) and that system frequencies are significantly higher than in the Triscend parts. -Pete- Steven K. Knapp <sknapp@triscend.com> wrote in message news:d2f86928.0111021647.56104ffb@posting.google.com... > There is an article from EE Times that provides quite a bit of data on > both parts that you mentioned. > http://www.eet.com/story/industry/semiconductor_news/OEG20011031S0025 > http://www.eetimes.com/story/OEG20011016S0097 > > Altera has demonstrated samples of the XA10 at trade shows and there > are parts listed at distribution, starting at over US$2,000 each > (single quantity). Looks like Altera's MIPS project is on the back > burner, though. All information on the MIPS-based product was removed > from their web site. > > I haven't seen a physical Virtex 2 Pro device yet but I know of at > least one company that is an "alpha" site. There doesn't appear to be > any datasheet on the Xilinx web site and I don't see any parts listed > at distribution. > > Both companies presented their solutions at the Microprocessor Forum > in October. These are exciting parts, assuming that you can afford > them. > > For comparison, there are other companies already shipping > similar, more cost-effective devices in production. Triscend, for > example, has been shipping 32-bit ARM-based devices with embedded > programmable logic for over a year now. The Triscend A7 family is in > production, available through distribution, and is supported by > compilers, synthesizers, simulators, development boards, etc. The > largest family member starts at about US$40, single piece. The A7 > configurable system includes cache, DMAs, UARTs, on-chip SRAM, SDRAM > controller, and lots of I/O pins. > http://www.triscend.com/products/indexa7.html > > For smaller systems, our 8-bit accelerated 8051-based E5 family has > parts starting as low as US$9 in single-piece quantities, or below > US$4 in high volumes. Like the A7 family, the E5 family is well > supported by development tools and boards. The E5 family has been in > production since 1999. > http://www.triscend.com/products/IndexE5.html > > Both product families are supported by the Triscend FastChip > development system and the FastChip drag-and-drop IP library, complete > with standard peripheral functions. > http://www.triscend.com/products/IndexFCintro.html > http://www.triscend.com/products/indexfc_ip.html > > > "Jae-cheol Lee" <jchlee@lge.com> wrote in message news:<dvHD7.1657$cI6.685978@news.bora.net>... > > There were some news about Virtex II with PowerPC cores. > > Altera announced Excalibur series with ARM or MIPS cores. 
> > > > What is the schedule of production of the first one? > > > > Is there any good story on the use of the second one? > > > > Please let me know...Article: 36251
I have never used Lattice's devices, so I don't know much about them. However, regarding PCI, I don't know how things will go with a CPLD, but at least with an FPGA, from my experience trying to get my own PCI IP core to meet timing with a Xilinx Spartan-II 150K system gate part, it is very hard for a synthesizable (HDL-based) PCI IP core to meet the timing of even 33MHz PCI (Tsu = 7ns and Tval (Tco) = 11ns) with only automatic Place & Route (in other words, without using a floorplanner tool). I know that Xilinx and Altera both have PCI IP cores that support 66MHz and 64-bit PCI, but it looks like Xilinx supports many more devices than Altera (Altera seems to provide constraint files for only a few devices, whereas Xilinx provides them for many more). Some people may be concerned that an SRAM-based FPGA may not be configured by the time the BIOS starts executing POST (Power On Self Test) code, but I am told that an FPGA that can handle PCI can program itself within 100ms, before RST# (PCI's reset signal) gets asserted to reset all PCI devices. POST code will get executed after that. I am not sure if your application really requires more than 133MB/s of bandwidth, but if it doesn't (in other words, 32-bit 33MHz PCI is adequate), you can prototype your design with the Insight Electronics Spartan-II PCI development board, which costs only $145 for the board itself (the same one I use). Although the Spartan-II part on the Insight Electronics Spartan-II PCI development board is called 150K system gates, the realistically usable gate count will be much lower. One problem with 64-bit 66MHz PCI (3V) is that very few systems support it compared to ubiquitous 5V 32-bit 33MHz PCI, but if you don't care about running your PCI card on regular desktop computers, that shouldn't be an issue. Regards, Kevin Brace (don't respond to me directly, respond within the newsgroup) arast@inficom.com (Alex Rast) wrote in message news:<ANEE7.153$FU5.365042@news.uswest.net>... > Lattice has a core on their site for 32-bit PCI, but I'm wondering if there is > available from Lattice or third parties a 64-bit core. It would be ideal if it > can run at 66MHz to boot. I'm looking to target an ispLSI8600VE. Such a core > would be helpful because it would let me save an additional chip on our > circuit board. Right now we're using a different (much smaller) CPLD on our > board for other, non-PCI functions. I'm getting ready to design the second > revision. I'm leaning towards the Lattice chip because I've been unsatisfied > with the s/w tools for the chip I have now, and because I really could use the > internal tristate busses on the 8000 series. The 8600VE is way, way overkill > in terms of macrocell density as a direct replacement for our current CPLD, > but if I could integrate the PCI core onto the chip as well then I think it's > justifiable. > > BTW, anybody out there have any experience with Lattice's tools? What are your > thoughts on them? > > Alex Rast > arast@qwest.net > arast@inficom.comArticle: 36252
It looked to me that the Spartan CLB does not include any dedicated carry logic. Each LUT has one output, so at least two LUT's would be required per bit... one to generate the cout bit, and one to generate the pc bit. Have you tried looking at the design after P&R with the FPGA editor? This may shed more light on the situation. Newman "Tim Boescke" <t.boescke@tu-harburg.de> wrote in message news:<9rvn5u$10hndj$1@ID-107613.news.dfncis.de>... > I am currently trying to synthesize a loadable > accumulator with synopsis. The target architecture > is a spartan. (not 2) > > In my opinion the code below should fit into one 4 LUT > per bit. (inputs to each 4 LUT: pc, cin, load, inp) > However, after synthesis the design requires no less > than 16 4-luts. > > Did I miss something ? Is there any way to infer a > combined add/load structure ? I already tried > lots of combinations without success and unfortunately > it seems that the xilinx libs dont allow direct > access to the LUTs and the carry logic for spartan.. > (They do for spartan 2) > > ------------------------------------------------------ > > architecture synth of counter is > signal pc: std_logic_vector(7 downto 0); > begin > process(clk) > begin > if (res ='1') then > pc <= "00000000"; > elsif rising_edge(clk) then > if (load = '1') then > pc <= inp; > else > pc <= pc + inp; > end if; > end if; > end process; > > outp <= pc; > end synth;Article: 36253
Spartan is based on XC4000, and SpartanXL is based on XC4000XL, and all of them have the same carry structure. Take a look at the XC4000 and XC4000XL documentation, it may be clearer. But it describes the identical architecture. Peter Alfke ==================================== newman wrote: > It looked to me that the Spartan CLB does not include any > dedicated carry logic. Each LUT has one output, so > at least two LUT's would be required per bit... one to > generate the cout bit, and one to generate the pc bit. > > Have you tried looking at the design after P&R with the > FPGA editor? This may shed more light on the situation. > > Newman > > "Tim Boescke" <t.boescke@tu-harburg.de> wrote in message news:<9rvn5u$10hndj$1@ID-107613.news.dfncis.de>... > > I am currently trying to synthesize a loadable > > accumulator with synopsis. The target architecture > > is a spartan. (not 2) > > > > In my opinion the code below should fit into one 4 LUT > > per bit. (inputs to each 4 LUT: pc, cin, load, inp) > > However, after synthesis the design requires no less > > than 16 4-luts. > > > > Did I miss something ? Is there any way to infer a > > combined add/load structure ? I already tried > > lots of combinations without success and unfortunately > > it seems that the xilinx libs dont allow direct > > access to the LUTs and the carry logic for spartan.. > > (They do for spartan 2) > > > > ------------------------------------------------------ > > > > architecture synth of counter is > > signal pc: std_logic_vector(7 downto 0); > > begin > > process(clk) > > begin > > if (res ='1') then > > pc <= "00000000"; > > elsif rising_edge(clk) then > > if (load = '1') then > > pc <= inp; > > else > > pc <= pc + inp; > > end if; > > end if; > > end process; > > > > outp <= pc; > > end synth;Article: 36254
Hi... Well I was very much asking from the viewpoint of the FLEX10K and APEX20K architectures. You see, in 10K they had one output of the LE going to the feedback matrix of the LLI and one output to the RFTs and CFTs. So that meant only either the registered or the unregistered output going to the LLI feedback matrix and also to the RFTs and CFTs. But that changed with the APEX20K devices, where both outputs could go to all routing resources. Now is that because they wanted both registered and unregistered outputs at the same time...? What was the need that made them make this change in the architecture...? I presumed there might be circuits where one might need that. Well, some more discussion on this will help a lot. Nitin. Ray Andraka <ray@andraka.com> wrote in message news:<3BE298CA.B67829D6@andraka.com>... > Depends entirely on your design style. For highest performance, you'll > generally want to avoid using the unregistered output, but that also > limits your design options. > > nitin wrote: > > > Hi... > > > > Can anyone tell me how frequently and where both registered and > > unregistered outputs from an LE are required...? > > > > Ciao, > > nitin. > > -- > --Ray Andraka, P.E. > President, the Andraka Consulting Group, Inc. > 401/884-7930 Fax 401/884-7950 > email ray@andraka.com > http://www.andraka.com > > "They that give up essential liberty to obtain a little > temporary safety deserve neither liberty nor safety." > -Benjamin Franklin, 1759Article: 36255
Hi... Well actually I was also pondering over the Mercury architecture, and questions do come to mind. Well, I still can't understand why not have 5 top LEs drive to the right and 5 bottom LEs to the left, or something like that. And this odd and even LEs in a LAB driving different resources carries on even to the RAPID LAB INTERCONNECT, and also the LEAP LINES... why...? Does this give some bigger region of a crudely defined cluster with faster routing within it? If yes, then what motivates that kind of decision, and what sort of shape & size does that crudely defined cluster have...? Well, I also spotted an interesting thing in the Mercury architecture... Up till now, horizontal resources used to drive vertical ones and vertical ones used to drive horizontal ones. But in Mercury we see Leap lines driving columns, columns driving columns, and priority columns driving columns as well as priority columns. Well, first of all the question that comes to mind is: when a column drives a column, is it driving a segment within itself...? I certainly hope not, because that does not make too much sense to me. But if it drives adjacent columns, then which ones? To the right or to the left or both? And what motivates such a decision? Well, I hope someone has the answers... Ciao, Nitin. Ray Andraka <ray@andraka.com> wrote in message news:<3BE30B9B.FD358F22@andraka.com>... > Right, but the real reason is to provide fast connections for logic using the > carry/cascade chains. Those chains run across the LE's in a LAB, which prevents > the LE's from connecting to another LE in the same LAB. The inter-lab connects > give you a way to connect arithmetic logic with a reasonable delay. > > Steve Fair wrote: > > > Digari - > > > > It's all about speed . . . > > > > The interleaving of the labs gives the router more flexibility. Each lab > > has its own local routes, which is the fastest non-dedicated route (as > > opposed to carry or cascade chains). If all the logic between two flops can > > fit into a lab, you will achieve the best performance possible. If the > > logic can't fit into the lab, you go to a megalab route in the apex II > > architecture, which adds delay AND uses another routing resource. By > > interleaving the labs, an LE can be connected to many more LE's for making > > those fast, local connections. The area expense isn't that great (a single > > line and mux to the next lab's local interconnect), so it's a very efficient > > way to increase routing and performance. Put another way, a lab goes from > > having 9 possible local connections to 19 with very little overhead. With > > the further interleaving available (remember the left & right drives), you > > can do some pretty deep equations with very small routing delays. > > > > Hope that helps. > > > > Steve > > > > "digari" <digari@dacafe.com> wrote in message > > news:e0855517.0111010034.375d9328@posting.google.com... > > > Why there is interleaved routing from an LE to adjecent LLIs in > > > Mercury and ApexII architectures. > > > > > > "APEX II devices use an interleaved LAB structure, so that each LAB > > > can > > > drive two local interconnect areas. Every other LE drives to either > > > the left > > > or right local interconnect area, alternating by LE." > > > > > > Can anyone shed some light on the alternate routing structure of LEs > > > within a LAB. > > -- > --Ray Andraka, P.E. > President, the Andraka Consulting Group, Inc. 
> 401/884-7930 Fax 401/884-7950 > email ray@andraka.com > http://www.andraka.com > > "They that give up essential liberty to obtain a little > temporary safety deserve neither liberty nor safety." > -Benjamin Franklin, 1759Article: 36256
What worked for me: I explicitly inserted BUFGs on all signals that I wanted a BUFG on, and then set in the Synplicity constraint file (project.SDC) the attribute xc_global_buffers to the number of BUFGs in the design. Synplicity generated warnings that some signals appeared to be clocks, but didn't add any unwanted BUFGs. Note that the xc_global_buffers attribute can be specified only in the SDC file. As far as I could see, Synplicity simply ignored syn_noclockbuf in the HDL code, since it knows better than I do what was and what wasn't a clock... "Jason T. Wright" <Jason.T.Wright@Boeing.com> wrote in message news:<3BE183E6.204EA0C9@Boeing.com>... > How does one prohibit Synplify from inferring a global buffer? I read > the on-line help, and it mentions using an attribute in the VHDL (or > verilog) code to FORCE a global buffer, but negating that attribute did > not stop its insertion. LeonardoSpectrum and FPGAExpress each has a > command to stop such an undesired action (or, correspondingly, to force > such an insertion.) Left to their own devices, the tools can create > wonderfully efficient, or wonderfully bloated, results from a user's > code. I've seen a little bit of each.Article: 36257
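For readers looking for the mechanics of the workaround described above, here is a minimal sketch of what explicit BUFG instantiation looks like in VHDL for a Xilinx target. The entity, port and signal names are invented for illustration only, and the exact spelling of the xc_global_buffers constraint (set in the .sdc as described in the post) should be checked against your Synplify version's documentation.

------------------------------------------------------

library ieee;
use ieee.std_logic_1164.all;

entity bufg_example is
  port (
    clk_pad : in  std_logic;  -- the one real clock
    slow_en : in  std_logic;  -- high-fanout enable a tool might mistake for a clock
    d       : in  std_logic;
    q       : out std_logic
  );
end bufg_example;

architecture rtl of bufg_example is
  -- Xilinx primitive, declared here as a component
  component BUFG
    port (I : in std_logic; O : out std_logic);
  end component;
  signal clk_g : std_logic;
begin
  -- Explicitly place the one BUFG we actually want, on the real clock.
  u_bufg : BUFG port map (I => clk_pad, O => clk_g);

  -- slow_en is used as ordinary logic; with xc_global_buffers set to the
  -- number of BUFGs instantiated here (1), no extra global buffer should
  -- be inferred for it.
  process (clk_g)
  begin
    if rising_edge(clk_g) then
      if slow_en = '1' then
        q <= d;
      end if;
    end if;
  end process;
end rtl;

------------------------------------------------------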
Why on earth does the Altera web site not allow downloads to be resumed? Every other web site seems to allow this. I tried emailing their webmaster about it, but didn't get a reply. Leon -- Leon Heller, G1HSM leon_heller@hotmail.con http://www.geocities.com/leon_heller Low-cost Altera Flex design kit: http://www.leonheller.comArticle: 36258
I am guessing that you are trying to download their free tools. I tried to do the same thing, but because they put their files on an FTP server, various download utilities like Download Accelerator cannot resume the download. I guess Altera thinks everyone has access to T1 lines. I gave up and ordered their latest Altera Digital Library, and the latest version (October 2001) had Quartus II 1.1 Web Edition, but it looks like they got rid of MAX+PLUS II-BASELINE from the CD-ROM. I don't like Quartus II 1.1 Web Edition's TalkBack feature (of course, I disabled it), and for some reason MegaWizard Plug-In Manager doesn't seem to work properly (Java-based, and somehow can't find the right file). Regards, Kevin Brace (don't respond to me directly, respond within the newsgroup) "Leon Heller" <leon_heller@hotmail.com> wrote in message news:<9s1g4d$2lg$1@plutonium.btinternet.com>... > Why on earth does the Altera web site not allow downloads to be resumed? > Every other web site seems to allow this. I tried emailing their webmaster > about it, but didn't get a reply. > > LeonArticle: 36259
Alex Rast wrote: > Lattice has a core on their site for 32-bit PCI, but I'm wondering if there is > available from Lattice or third parties a 64-bit core. It would be ideal if it > can run at 66MHz to boot. The PCI core from www.opencores.org should run at 66 Mhz. They've been working on a 64-bit version, but I don't know, if it's done yet.Article: 36260
Hi, I am looking for stand-alone FPGA development boards. Must have: 1. ~ 800 K (more or less) gates (prefer Virtex or Virtex-E, although Altera will do :)) 2. ~ 32MBytes or more SDRAM (or DDR) (prefer a 64-bit DIMM) 3. SSRAM (any type), independent of the SDRAM. 4. some Flash 5. digitized video input (expansion card acceptable) 6. RAMDAC (reasonable quality, 800x600x24 will do) for VGA output Wish list: 1. RS232 2. Ethernet 3. Parallel port 4. USB 5. digitized Audio in/out. I have found one which is close but lacks SDRAM !! http://www.xess.com/ XSV-800 boards Any pointers will be much appreciated. btw. I am a poor student, so I can't afford huge sums of money :)Article: 36261
Hello, We have a problem configuring an Altera APEX EP20K100 device on our prototype video codec card. The first option for our card is when the DAC chip on it supplies two clock signals to the APEX (through I/O pins). The second option is when the DAC chip accepts two clock signals from the APEX. The rest of the external APEX-related signals are tri-stated or left unconnected (mezzanine ADC card). We are using a JTAG downloader (Quartus II Programmer Tool). The second option works well. The APEX begins its operation after successful configuration via JTAG. But the first option does not work at all. We cannot get the APEX configured, and Quartus reports Configuration Failed at the 98% progress bar indicator. In this particular project we can enforce the second option (totally tri-stated environment) at start-up so as to get the APEX configured, but our next large project (now in progress) is built around several multi-FPGA cards (APEX, mostly), and we cannot impose this restriction there. Thank you. Igor Kauranen P.S. I would like to say thank you very much to Peter Ormsby and to Carl Schlehaus for their help with Nios.Article: 36262
Patently untrue. The Spartan architecture is basically the xilinx 4000E architecture. It has a dedicated carry chain _in_front_ of the LUTs. To instantiate it, you use the CY4 primitive plus one of the CY4 mode select primitives, which is connected to the CY4 with an 8 bit bus, selecting the right CY4 mode set for the function you wish to instantiate. You'll have to refer to the carry section of the libraries guide for the details. That said, synthesis should instantiate the carry chain. IIRC (It has been a little while since I last used the 4K architecture), synplicity doesn't infer the carry chain for 4K devices when there are less than about 6-8 bits in the arithmetic/count function. You may have to massage your description to make it into something the synthesizer can successfully infer. Try putting the mux and add outside the process as a concurrent statement. newman wrote: > It looked to me that the Spartan CLB does not include any > dedicated carry logic. Each LUT has one output, so > at least two LUT's would be required per bit... one to > generate the cout bit, and one to generate the pc bit. > > Have you tried looking at the design after P&R with the > FPGA editor? This may shed more light on the situation. > > Newman > > "Tim Boescke" <t.boescke@tu-harburg.de> wrote in message news:<9rvn5u$10hndj$1@ID-107613.news.dfncis.de>... > > I am currently trying to synthesize a loadable > > accumulator with synopsis. The target architecture > > is a spartan. (not 2) > > > > In my opinion the code below should fit into one 4 LUT > > per bit. (inputs to each 4 LUT: pc, cin, load, inp) > > However, after synthesis the design requires no less > > than 16 4-luts. > > > > Did I miss something ? Is there any way to infer a > > combined add/load structure ? I already tried > > lots of combinations without success and unfortunately > > it seems that the xilinx libs dont allow direct > > access to the LUTs and the carry logic for spartan.. > > (They do for spartan 2) > > > > ------------------------------------------------------ > > > > architecture synth of counter is > > signal pc: std_logic_vector(7 downto 0); > > begin > > process(clk) > > begin > > if (res ='1') then > > pc <= "00000000"; > > elsif rising_edge(clk) then > > if (load = '1') then > > pc <= inp; > > else > > pc <= pc + inp; > > end if; > > end if; > > end process; > > > > outp <= pc; > > end synth; -- --Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email ray@andraka.com http://www.andraka.com "They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759Article: 36263
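To make Ray's last suggestion concrete, here is a minimal sketch of the accumulator with the load mux and the adder pulled out of the clocked process as a concurrent statement. It reuses Tim's entity, ports and std_logic_unsigned use clause, adds res to the sensitivity list, and is only an illustration; whether a given Synplicity version then infers the Spartan carry chain from it still has to be checked in the placed design.

------------------------------------------------------

-- Sketch only: same behaviour as Tim's counter, restructured so the
-- adder plus load mux are one combinational block feeding a plain register.
architecture synth2 of counter is
  signal pc      : std_logic_vector(7 downto 0);
  signal next_pc : std_logic_vector(7 downto 0);
begin
  -- add/load mux outside the process, as a concurrent statement
  next_pc <= inp when load = '1' else pc + inp;

  process (clk, res)
  begin
    if res = '1' then
      pc <= "00000000";
    elsif rising_edge(clk) then
      pc <= next_pc;
    end if;
  end process;

  outp <= pc;
end synth2;

------------------------------------------------------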
We produce DSP modules which support small daughtercards for various types of IO. Each of these daughtercards requires a different interface to the DSP module. The obvious way to handle this is to use FPGAs which can be downloaded with the design unique to each daughtercard installed. On our first design we used one FPGA as a main board controller and used a separate FPGA for each daughtercard interface. This allowed us to load the FPGAs at boot-up time after determining the identity of the daughtercards installed. The problem with this approach is that we are increasing the number of daughtercards supported from 2 to 4 and we are using a very small form factor main board (PC/104). So we are very cramped for space for two more FPGAs. This becomes a bigger problem as we move to newer FPGA families which do not support the small TQ100 package. Of course there are the uBGA packages, but they make board routing much harder with so many in such a small space. We can get a significant cost and size savings using a single, larger FPGA. But this complicates the download process since the daughtercard interfaces are then all in a single download. If we want to handle all possible configurations, we will have to provide literally thousands of possible bitstreams. Can we achieve the same real-time configuration as in the multi-FPGA approach, or at least allow a user a way to build the appropriate download for the main module FPGA using JBITS? We don't want to require a user to download FPGA tools and perform a place and route just to reconfigure their boards. Can we do that with JBITS, use separate modules for the interfaces and combine them without using the place and route tools? -- Rick "rickman" Collins rick.collins@XYarius.com Ignore the reply address. To email me use the above address with the XY removed. Arius - A Signal Processing Solutions Company Specializing in DSP and FPGA design URL http://www.arius.com 4 King Ave 301-682-7772 Voice Frederick, MD 21701-3110 301-682-7666 FAXArticle: 36264
Tim wrote: > > "Peter Alfke" <palfke@earthlink.net> wrote > > Irwin Kennedy wrote: > > > * Use less "leaky" transistors! (?) > > > > Until recently, there was extremely little junction leakage or > > sub-threshold leakage current. XC3000L could run on 50 microamps. > > Unfortunately, as we approach 100 nm technology, subthreshold leakage > > current becomes significant ( for every IC manufacturer. Ask Intel! ) > > I should not have to ask, but what is sub-threshold leakage? A CMOS transistor that is turned off still conducts current. This is only a picoampere or so per transistor, but with 100 million transistors in a large FPGA this can become a concern. The leakage current depends heavily on the threshold voltage at which the transistor is turned off. The speed of the transistor - the amount of current it conducts if turned on - also depends on the threshold voltage. This means that there is a tradeoff between speed and leakage current. The threshold voltage can be controlled by the voltage on the fourth "bulk" connection of the transistor. In non-SOI technologies this costs a lot of area or might even be impossible, but in principle in an FPGA each slice could be configured for a different speed/leakage tradeoff. This way the speed for a given leakage current could be improved by budgeting more leakage to the critical path. Kolja SulimmaArticle: 36265
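A rough quantitative footnote to the above (standard textbook subthreshold model, not from the original post): the off-state current of a MOSFET is approximately I_off ~ I_0 * exp(-Vth / (n*kT/q)), where n is typically around 1.2 to 1.5, so the subthreshold swing n*(kT/q)*ln(10) works out to roughly 70-100 mV per decade at room temperature. In other words, every ~100 mV taken off the threshold voltage to gain speed buys roughly a 10x increase in leakage, which is why per-region Vth or body-bias budgeting of the kind described above is attractive.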
rickman <spamgoeshere4@yahoo.com> writes: > On our first design we used one FPGA as a main board controller and used > a separate FPGA for each daughtercard interface. This allowed us to load > > The problem with this approach is that we are increasing the number of > daughtercards supported from 2 to 4 and we are using a very small form > factor main board (PC/104). So we are very cramped for space for two > > We can get a significant cost and size savings using a single, larger > FPGA. But this complicates the download process since the daughtercard > > Can we achieve the same real-time configuration as in the multi-FPGA > approach or at least allow a user a way to build the appropriate > download for the main module FPGA using JBITS? We don't want to require > a user to download FPGA tools and perform a place and route just to > reconfigure their boards. Can we do that with JBITS, use separate > modules for the interfaces and combine them without using the place and > route tools? Should be possible with JBits. Basically take the large FPGA (JBits demands that it must be a Virtex (no -E or -EM !) or a Spartan-II of a size also existing in Virtex) and "cut" the FPGA space into one "master" section and 4 "expansion" sections. Cutting should preferably be vertical if run-time reloading is needed (because Virtex reloads columns), else use any 4 identically sized sections. Something like this:

.-----------------------------------.
|        .--..--..--..--.           |
|        |  ||  ||  ||  |           |
|        |  ||  ||  ||  |           |
|        |  ||  ||  ||  |           |
|        |  ||  ||  ||  |           |
|        |  ||  ||  ||  |           |
|        |  ||  ||  ||  |           |
|  base  |ex||ex||ex||ex|           |
|        | 1|| 2|| 3|| 4|           |
|        |  ||  ||  ||  |           |
|        |  ||  ||  ||  |           |
|        |  ||  ||  ||  |           |
|        |  ||  ||  ||  |           |
|        |  ||  ||  ||  |           |
|        `--'`--'`--'`--'           |
`-----------------------------------'

or like this (if not run-time reconfig):

.-----------------------------------.
|   .-------.       .-------.       |
|   |       |       |       |       |
|   |  ex1  |       |  ex2  |       |
|   |       |       |       |       |
|   `-------'       `-------'       |
|                                   |
|               base                |
|                                   |
|   .-------.       .-------.       |
|   |       |       |       |       |
|   |  ex3  |       |  ex4  |       |
|   |       |       |       |       |
|   `-------'       `-------'       |
`-----------------------------------'

Then generate a "base" .bit bitstream for the master stuff. This will have to offer 4 "ports" (basically a set of routing lines and signals on them leading into the master part) for attaching the expansion parts. The "ports" represent a) connection to "base" stuff and b) connection to the appropriate slot. From here on, the implementation differs depending on how much you want the user of the board to do: 1. user has to install JBits on their system Simply program the expansions as .java files that use a standard call interface and then have the user edit a small config file that says which of these files is to be "included" onto which port of the master. Then let JBits compile. I suspect you do not want to go this way, as it requires users to install JBits software, but JBits compiles small programs in minutes, so it may be OK if you are just worried about config time. 2. User has JBits, but faster way Use JBits in the base, in a way that a standard "pinout" is made that interfaces the group of expansion CLBs to "base". "Pins" in this sense being actual FPGA routing wires (the 24 north/east/south/west wires). Just for the base->expansion wires you will have to use "raw" jbits.set() functions on both sides of the "divide". Then make one .bit with the base and with 4 "holes" (CLBs left unconfigured) where the expansions would go. Then have JBits load this pre-compiled .bit and the expansions in .java files and just add the expansion parts. 
This still requires the users to have JBits, but will take under a minute to compile. And no "base" source, if that is relevant to you. 3. user has no JBits at all on their system Take the same .bit with holes in it from above. Same "pins" interface. Then have one precompiled .bitx file for each expansion module (made by a module-specific .java), containing just the CLBs for one subsection (always fitted to, say, the first hole). Then use the information in XAPP155 to write your own program that "cuts out" the 4 user-selected .bitx sections and "merges" them into the base .bit file. XAPP155 documents the arrangement of the basic 48x18 bits that configure one CLB. -- Neil Franklin, neil@franklin.ch.remove http://neil.franklin.ch/ Hacker, Unix Guru, El Eng HTL/BSc, Sysadmin, Archer, Roleplayer - Intellectual Property is Intellectual RobberyArticle: 36266
>Hmm, dont know so much about the 3k series in the good old times. But I >think all these 32 3k parts can be combined into 1 or two uptodate parts >(Virtex-E, Virtex-II) I expect so but it would be a huge redesign. Interesting though. >> The 3k series device is now obsolete. I would also not like to risk >> rebuilding this board with some newer XC3064s because I know the more >> recent devices are much faster and some structures don't work anymore, > >Nasty asynchronous tricks ?? ;-) Not so much "nasty" as OK with the older parts - it was OK for example to use a "long line" for clock distribution. One used the "L" attribute, together with perhaps "SC=1" on the clock net to force its allocation to a long line. I was very specifically told by a Xilinx engineer (c. 1991) that this was OK. And it worked faultlessly. This was because the worst-case clock net skew was less than the clock-Q delays in the D-types. What happened later, even going from slow to faster XC3000 devices, was that the clock-Q timing improved a lot faster than the local interconnect timing. This IMHO is to be expected when you speed up your gates; speeding up the interconnect by the same factor is a lot harder. So a lot of previously solid clock gating schemes got broken. Obviously, nowadays, you would design everything to work off the global clock nets, and use clock-enable to do the clock gating. Unfortunately, I also used these parts for ASIC prototyping, and when you are doing a low power design, the most important method by far to reduce dynamic power is to gate clocks. Using a global clock net and a clock enable does not reduce dynamic power, well hardly anyway. On those designs I got my fingers burnt a bit :) Peter. -- Return address is invalid to help stop junk mail. E-mail replies to zX80@digiYserve.com but remove the X and the Y. Please do NOT copy usenet posts to email - it is NOT necessary.Article: 36267
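For readers who have not seen the two styles side by side, here is a small generic VHDL sketch (not from the original posts; entity and signal names are invented) contrasting the gated-clock idiom that broke on faster XC3000-class parts with the clock-enable idiom that is safe for timing but, as noted above, saves little dynamic power on the clock tree.

------------------------------------------------------

library ieee;
use ieee.std_logic_1164.all;

entity gate_vs_ce is
  port (clk, clk_gate, d : in  std_logic;
        q_gated, q_ce    : out std_logic);
end gate_vs_ce;

architecture sketch of gate_vs_ce is
  signal gated_clk : std_logic;
begin
  -- Style 1: gated clock. Saves dynamic power on the gated net, but on
  -- an FPGA that net is ordinary routing, so skew grows as clock-to-Q
  -- delays shrink, which is exactly the failure mode described above.
  gated_clk <= clk and clk_gate;
  process (gated_clk)
  begin
    if rising_edge(gated_clk) then
      q_gated <= d;
    end if;
  end process;

  -- Style 2: clock enable on the global clock net. Safe for timing, but
  -- the clock tree keeps toggling, so dynamic power is barely reduced.
  process (clk)
  begin
    if rising_edge(clk) then
      if clk_gate = '1' then
        q_ce <= d;
      end if;
    end if;
  end process;
end sketch;

------------------------------------------------------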
They range from a few dozen FFs = a few hundred gates to several millions. There are EEPROM-based chips and SRAM-based chips. The boundary is somewhere at 128 FFs. The higher density ones use SRAM then. For small-scale development, the chip costs are nil compared to the time and perhaps money spent on the software, e.g. $20 for 30k gates. Yes, you can build a CPU or a 3D engine, but these are not the projects to start with. A thick chip with 10 million gates also requires a thick machine with 1G RAM and software for a few k$. Rene -- Ing.Buero R.Tschaggelar - http://www.ibrtses.com bazaillion@yahoo.com wrote: > > Hello, > > I am new to FPGA/PLA/CPLD technology. > > I have a couple questions about them. > > (1) How dense can you get these (How many gates)? > > (2) Hiw much are they typically if you are buy small quantities 1-5 > for prototyping? > > (3) Are they dense enough to build a CPU or a 3d VGA display chip > with these? or maybe multiple chips. > > Thanks, > -M. BazaillionArticle: 36268
On 31 Oct 2001 09:02:55 -0800, husby_d@yahoo.com (Don Husby) wrote: >Mike Treseler <mike.treseler@flukenetworks.com> wrote in message news:<3BDF4799.A379AA16@flukenetworks.com>... >> Russell Shaw wrote: >> > ... >> > most of the >> > device settings you do in the flow tabs don't get >> > saved with the project. >> >> Haven't seen anything like that. > >Leonardo is full of things like that: project settings >flip around like a bad politician. Constraints evaporate. >Constraints are added. Flags appear to be set, but really >aren't. File names are changed. My favorite is when >each source file is added twice to the compile list. >The most consistently annoying is when the output file >name is silently changed - sometimes you don't discover >it until you've done a complete place and route, and find >that nothing changed. > >At some point, you just have to delete all of the Leonardo >project files (*.lsp *.scr *.xdb ...) and start over. > >It's amazing to me that the Leonardo user interface has >been so crappy for so many years. At some point, you >just have to delete Leonardo and start over. Current favourite is the "stop" button you press when you notice one of these absurdities, which puts you on ignore, then finally aborts the current phase of processing several minutes later, then continues with all the other phases anyway. Dumb. Just dumb. Or the "exit" button that politely tells you to press the "stop" button before exiting Leonardo. Or the way it picks the last file in the input file list as a suitable name for the design output ( reasonable) but the first in the list for the constraint file name! I get the impression they test the synthesis engine, but never bother about the UI because everybody uses the command line interface. Wonder if it ever occurred to them, that was because their UI is so bug-ridden. - BrianArticle: 36269
As I have posted before, old XC3000 parts are still around, and a switch to XC3000L might solve the excessive logic speed problem. Peter Alfke, Xilinx Applications ==================================== Peter wrote: > >Hmm, dont know so much about the 3k series in the good old times. But I > >think all these 32 3k parts can be combined into 1 or two uptodate parts > >(Virtex-E, Virtex-II) > > I expect so but it would be a huge redesign. Interesting though. > > >> The 3k series device is now obsolete. I would also not like to risk > >> rebuilding this board with some newer XC3064s because I know the more > >> recent devices are much faster and some structures don't work anymore, > > > >Nasty asynchronous tricks ?? ;-) > > Not so much "nasty" as OK with the older parts - it was OK for example > to use a "long line" for clock distribution. One used the "L" > attribute, together with perhaps "SC=1" on the clock net to force its > allocation to a long line. I was very specifically told by a Xilinx > engineer (c. 1991) that this was OK. And it worked faultlessly. This > was because the worst-case clock net skew was less than the clock-Q > delays in the D-types. > > What happened later, even going from slow to faster XC3000 devices, > was that the clock-Q timing improved a lot faster than the local > interconnect timing. This IMHO is to be expected when you speed up > your gates; speeding up the interconnect by the same factor is a lot > harder. So a lot of previously solid clock gating schemes got broken. > > Obviously, nowadays, you would design everything to work off the > global clock nets, and use clock-enable to do the clock gating. > > Unfortunately, I also used these parts for ASIC prototyping, and when > you are doing a low power design, the most important method by far to > reduce dynamic power is to gate clocks. Using a global clock net and a > clock enable does not reduce dynamic power, well hardly anyway. On > those designs I got my fingers burnt a bit :) > > Peter. > -- > Return address is invalid to help stop junk mail. > E-mail replies to zX80@digiYserve.com but remove the X and the Y. > Please do NOT copy usenet posts to email - it is NOT necessary.Article: 36270
Hi, I have checked the Xilinx datasheet, and the speed of an adder in an XCV1000E-6 (1.8V) is very fast: 16 bit 4.x ns, 64 bit 6.x ns. But the timing analyzer reports something different: CLB_R27C37.S1.G3 net (fanout=1) 2.084R jk<1> CLB_R27C37.S1.COUT Topcyg 1.000R j<0> ijk_add/C2/C3/C0 ijk_add/C2/C3/C2 CLB_R26C37.S1.CIN net (fanout=1) 0.000R ijk_add/C2/C3/C2/O CLB_R26C37.S1.COUT Tbyp 0.149R j<2> ijk_add/C2/C4/C2 ijk_add/C2/C5/C2 CLB_R25C37.S1.CIN net (fanout=1) 0.000R ijk_add/C2/C5/C2/O CLB_R25C37.S1.Y Tciny 0.677R j<4> ijk_add/C2/C6/C2 ijk_add/C2/C7/C1 CLB_R30C32.S0.F4 net (fanout=2) 1.758R j<5> CLB_R30C32.S0.X Tilo 0.468R C8/N49 C741 where Tbyp is the Cin-to-Cout delay, which is quite fast, but Topcyg is the delay from the G input to Cout. This should be the carry-out generated in the upper LUT. If this were the per-bit delay, a 64-bit adder would never be 6ns but at least 64 ns. So why does the report count only one of these and not all of them in the path? What I do is: jk <= j + k; ijk <= jk + i; help me pls ---- BrittleArticle: 36271
bazaillion@yahoo.com wrote in message news:<3be44c04.109339469@news.charter.net>... > Hello, > > I am new to FPGA/PLA/CPLD technology. > > I have a couple questions about them. > > > (1) How dense can you get these (How many gates)? > Although I don't use a CPLD myself, a fairly large CPLD can fit about 10,000 gates. Most CPLDs are EEPROM-based, which means that they can be reprogrammed a lot of times, and they are active immediately upon power-on. Manufacturers of CPLDs include Altera (http://www.altera.com), Cypress (http://www.cypress.com), Lattice (http://www.latticesemi.com), and Xilinx (http://www.xilinx.com). Most FPGAs are much larger than CPLDs. However, manufacturers of FPGAs really have a bad habit of inflating the gate count the chip can realistically fit. For example, take an FPGA I use, the Xilinx Spartan-II 150,000 "system gate" part (XC2S150): although Xilinx (http://www.xilinx.com) claims that part can fit 150,000 "system gates", the realistic gate count (not using vendor proprietary features) is about 30,000 to 35,000 gates. Yes, Xilinx's "system gate" inflates the realistically achievable gate count by about 5 times. Xilinx's rival Altera (http://www.altera.com) also inflates the gate count, but in my opinion the inflation seems to be about 2 to 3 times. Spartan-IIs are low-end FPGAs, but a high-end FPGA like the Xilinx Virtex-II 6M system gate part (XC2V6000) should be able to fit more than 1M realistic gates, if my assumption is correct. Xilinx and Altera FPGAs are based on SRAM, so a Configuration PROM (EEPROM based) has to be attached to program the FPGA when the power is turned on. There are other FPGA manufacturers that make FPGAs based on antifuse, like Actel (http://www.actel.com) and Quicklogic (http://www.quicklogic.com), but antifuse FPGAs, I think, are hard to use for prototyping because once you program them, they cannot be programmed again. That's because in antifuse FPGAs, you burn off fuses inside the FPGA to program it. Nowadays, antifuse FPGAs are not as popular as SRAM FPGAs, because their density is relatively small compared to SRAM FPGAs, and they are hard to use for prototyping. > (2) Hiw much are they typically if you are buy small quantities 1-5 > for prototyping? Let's say that you want to purchase one Xilinx Spartan-II 150,000 "system gate" part (XC2S150) from a Xilinx distributor. At the Insight Electronics order website (http://www.insight-electronics.com/order/index.html), you can type in a part number. If you type in XC2S150, it gives you a list of various XC2S150 parts with different package options and speed grades. Yes, because Spartan-IIs are geared as low-cost FPGAs, they only cost between $20 and $40 per chip. You also have to purchase a Configuration PROM (XC18V01), which costs about $23. However, if you try to buy a high-end FPGA like the Xilinx Virtex-II 6M system gate part (type in XC2V6000), a single chip will cost you between $4,000 and $6,000. Yes, $4,000 to $6,000 may sound like a lot, but for some applications where the production volume is extremely small (let's say specialized communications equipment), buying a couple of Virtex-IIs will likely be cheaper than fabricating a custom chip (an ASIC). Recent FPGAs use packages like PQFP or BGA, which require specialized equipment to solder onto a PCB (especially the BGA package). Therefore, I think it will be much more practical to purchase a prototype board with an FPGA on it. 
My recommendation is that you pick up one from Insight Electronics (http://www.insight-electronics.com/solutions/kits/xilinx/index.shtml) with a Spartan-II on it. One of them is a standalone board with a Spartan-II on it, and another has a Spartan-II on a PCI card (the one I use). For design tools, you can use Xilinx ISE WebPack, which is free, supports all Spartan-II devices, and comes with a simulator called ModelSim XE-Starter. I think you should not initially consider paying for tools, because if you paid $1,000 for tools and you didn't like designing hardware with an FPGA after all, you would still be holding the bag with useless software. At least a free tool won't have such a risk. Other CPLD and FPGA vendors also offer free design tools, but Xilinx is the only one that offers a free simulator. It is far easier to find bugs running a simulator than firing up the actual FPGA, especially as the design gets bigger and more complex. To design circuits for an FPGA, you should learn languages like Verilog or VHDL. I would recommend learning Verilog, but I am sure some people will say VHDL. It really doesn't matter which one you learn. One thing I can say from my experience is that a lot of books about Verilog and VHDL available out there are not well written. Examples inside the books are too trivial, too boring, or too difficult to understand. > > (3) Are they dense enough to build a CPU or a 3d VGA display chip > with these? or maybe multiple chips. > > Thanks, > -M. Bazaillion You can obtain some homegrown CPUs from websites like http://www.fpgacpu.org, http://www.free-ip.com, and http://www.opencores.org. Whether or not the CPUs you get from those websites are worth playing around with is another question. Those CPUs are pretty small, so they should be able to fit inside most Spartan-II based prototype boards. However, my guess is that you are thinking of whether or not an x86 processor might fit inside an FPGA. Probably a 4 to 5 year old x86 processor might fit inside one Xilinx Virtex-II 6M system gate part, and if one is not enough, you can use multiple of them. In fact, because of the high cost of fabricating an ASIC, more designers use multiple high-density FPGAs to debug a design initially before committing to an ASIC. That way you have less chance of making a mistake when creating the ASIC. At the 0.18u process node, it costs $300,000 in NRE (Non-Recurring Engineering) charges to fabricate an ASIC for volume production. At 0.13u, the NRE charge will approach $1 million . . . Even one small mistake can be fatal, because you are talking about paying another $300,000 (in 0.18u) to fabricate an ASIC for volume production. For a 3D graphics chip, again a 4 to 5 year old one should fit inside a Xilinx Virtex-II 6M system gate part, but you will have to put a RAMDAC outside of the chip because an FPGA doesn't contain one (TI and IBM used to make high-end RAMDACs about 4 to 5 years ago). Regards, Kevin Brace (don't respond to me directly, respond within the newsgroup)Article: 36272
Good morning, I use Synplify 6 to synthesize my project; it also performs the mapping and produces an EDIF file. I want to send this to Xilinx 3.3 for Place and Route; how can I do this? There is an option to use Synplify directly from Xilinx, but this does not seem to work on my computer. Thanks for your help. BananaArticle: 36273
> Unfortunately you cannot use both the rising and the falling edge in a single process. What you can do is 1. double the clock (by using a dll) CAN YOU EXPLAIN TO ME BETTER HOW I COULD DO THIS ??? THANKS BANANA P.S. : MY IDEA IS TO COUNT FROM 0 TO 2 AND WITH THE SAME CIRCUIT ALSO DIVIDE A CLOCK BY THREE. HERE IS THE CODE, REARRANGED:

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;

entity counter_divider_3 is
  port (
    clk       : in  STD_LOGIC;
    reset     : in  STD_LOGIC;
    count_3   : out STD_LOGIC_VECTOR (2 downto 0);
    clk_div_3 : out STD_LOGIC
  );
end counter_divider_3;

architecture counter_divider_3_arch of counter_divider_3 is
begin
  process (clk, reset)
    variable count_3_internal   : STD_LOGIC_VECTOR (2 downto 0);
    variable clk_div_3_internal : std_logic;
  begin
    if reset = '1' then
      count_3_internal := "000";
    elsif falling_edge(clk) then
      if (count_3_internal = "010") then
        -- the counter has to be reset to zero
        count_3_internal   := "000";
        clk_div_3_internal := '0';
      else
        count_3_internal := count_3_internal + 1;
      end if;
    elsif count_3_internal = "001" then
      clk_div_3_internal := '1';
    end if;
    count_3   <= count_3_internal;
    clk_div_3 <= clk_div_3_internal;
  end process;
end counter_divider_3_arch;
Article: 36274
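Since the question above is how to double the clock with a DLL, here is a minimal sketch of the usual approach on Virtex/Spartan-II class parts, using the CLK2X output of the Xilinx CLKDLL primitive. The entity and signal names are invented for illustration, and the primitive's exact ports and attributes should be checked against the Xilinx libraries guide for your device family (older families such as the original Spartan have no DLL at all).

------------------------------------------------------

library ieee;
use ieee.std_logic_1164.all;

entity clk_doubler is
  port (
    clk_in : in  std_logic;   -- board clock pad
    rst    : in  std_logic;
    clk2x  : out std_logic;   -- doubled clock for the divide-by-3 counter
    locked : out std_logic
  );
end clk_doubler;

architecture rtl of clk_doubler is
  -- Xilinx primitives, declared here as components
  component IBUFG port (I : in std_logic; O : out std_logic); end component;
  component BUFG  port (I : in std_logic; O : out std_logic); end component;
  component CLKDLL
    port (CLKIN, CLKFB, RST : in std_logic;
          CLK0, CLK90, CLK180, CLK270, CLK2X, CLKDV, LOCKED : out std_logic);
  end component;
  signal clk_ibuf, clk0_int, clk0_buf, clk2x_int : std_logic;
begin
  u_ibufg : IBUFG port map (I => clk_in, O => clk_ibuf);

  u_dll : CLKDLL
    port map (CLKIN => clk_ibuf, CLKFB => clk0_buf, RST => rst,
              CLK0  => clk0_int, CLK90 => open, CLK180 => open, CLK270 => open,
              CLK2X => clk2x_int, CLKDV => open, LOCKED => locked);

  -- CLK0 must be fed back through a global buffer for deskew.
  u_bufg0  : BUFG port map (I => clk0_int,  O => clk0_buf);
  u_bufg2x : BUFG port map (I => clk2x_int, O => clk2x);
end rtl;

------------------------------------------------------

Once the doubled clock is available, the divide-by-3 counter can run entirely on rising_edge of clk2x, so the falling-edge branch in the code above is no longer needed.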
I would like to know how effective the Floorplanner software that comes with the Xilinx ISE series of software is. The design I am working on is a PCI IP core whose Tsu (setup time) has to be less than 7ns, and whose Tval (clock to output valid) has to be less than 11ns. Currently, the worst Tsu I have is 12.974ns, and the worst Tval I have is 16.594ns. I synthesized my design with XST Verilog, and I used only automatic P&R with user constraints (Pad to Setup = 7ns, Clock to Pad = 11ns). The software I am currently using is ISE WebPack 4.1. Is it realistic to expect that I will get Tsu and Tval within 7ns and 11ns respectively if I use the Floorplanner? Will reducing fan-out during synthesis help? If so, what number (default is 100) is appropriate? Are there any other helpful synthesis/P&R options that will improve the timings? The part I am using is the Spartan-II 150K system gate part, speed grade -5, which comes with the Insight Electronics Spartan-II PCI Development Kit, so the use of speed grade -6 is not an option. I already synthesized my design with the speed grade -6, and that improved the worst timings by 20%, but that still wasn't enough by about 15%. I already tried "Pack I/O Registers/Latches into IOBs" in MAP. If I packed IOBs for input by selecting "For Inputs Only" or "For Inputs or Outputs", it created a positive hold time, a no-no in PCI (hold time has to be 0ns in PCI). Selecting "For Outputs Only" didn't seem to improve Tval that much. If the Floorplanner is going to help, what kinds of strategies should I use to hand-place the design? Regards, Kevin Brace (don't respond to me directly, respond within the newsgroup)