Site Home Archive Home FAQ Home How to search the Archive How to Navigate the Archive
Compare FPGA features and resources
Threads starting:
Authors:A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
I don't really know about porting GCC. But have a suggestion. If you architecture is such that you can take sequential instruction and execute them together, perhaps you could just produce your own assembler that takes a sequential program for your architecture and do a simple optimization there using a simple look ahead to decide which instructions can be executed in parallel, so it's completely the assembler that makes that optimizations. This would perhaps let you use lcc if you wanted, and you get a 'free' assembler as a bonus! Ralph "Marc Van Riet" <marcvanriet@yahoo.com> wrote in message news:3e35c83e$0$27955$ba620e4c@news.skynet.be... > Hi, > > I'm writing a small processor core as a hobby. It is a small 16-bit core > with about 20 instructions, including simple ALU functions, memory read and > writes, stack operations, ... All instructions can be conditional > (depending on the value of my 'working register'). Up to three operations > can be combined in a single opcode, although not all combinations are > possible. > > I was wondering if it would be do-able for me to port the GNU C-compiler to > my own core. I guess it is possible, since the ARM has conditional > instructions too, and it has multiple instructions in one opcode too. > > But how long would that take me ? Anybody any experience in porting GNU-C to > a similar core as mine ? > > Are there any alternatives ? The LCC compiler isn't an option I think, > because their original target core doesn't have multiple-instructions or > conditional code. > > Any comments are appreciated, > Marc > > > > > >Article: 51976
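The look-ahead idea in the reply above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Instr` record, the register names, and the `can_combine` hook are hypothetical, the width of 3 comes from the "up to three operations" in the question, and the packer never reorders instructions. It greedily puts the next instruction into the current opcode when there is no read/write conflict with what is already in it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record: mnemonic plus registers written/read."""
    op: str
    dst: frozenset = frozenset()
    src: frozenset = frozenset()


def ins(op, dst="", src=""):
    """Convenience constructor: registers are given as space-separated names."""
    return Instr(op, frozenset(dst.split()), frozenset(src.split()))


def conflicts(a, b):
    """True if b cannot run in the same opcode as an earlier a (RAW, WAR or WAW)."""
    return bool(a.dst & b.src or a.src & b.dst or a.dst & b.dst)


def bundle(program, width=3, can_combine=lambda ops: True):
    """Greedy in-order packing of a sequential program into parallel bundles.

    can_combine is a placeholder for the core's real encoding rules
    ("not all combinations are possible"); by default everything is allowed.
    """
    bundles, current = [], []
    for instr in program:
        fits = (
            len(current) < width
            and not any(conflicts(prev, instr) for prev in current)
            and can_combine([i.op for i in current] + [instr.op])
        )
        if fits:
            current.append(instr)
        else:
            if current:
                bundles.append(current)
            current = [instr]
    if current:
        bundles.append(current)
    return bundles


program = [
    ins("load", "r1", "mem"),
    ins("add", "r2", "r3 r4"),   # independent of the load
    ins("sub", "r5", "r1 r2"),   # needs r1 and r2, so it starts a new bundle
    ins("push", "", "r6"),       # independent of the sub
]
packed = [[i.op for i in b] for b in bundle(program)]
print(packed)
```

A real assembler would also have to model implicit resources such as the stack pointer, flags, and the working register that the conditions depend on; those are left out here.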
Assuming it to be bad practice, I'd like to verify the possibility of switching clock inputs. There are 2 continuous signals, one connected to the dedicated clock input and another one on a nearby pin, both in the range of 80 MHz. Now we'd like to select internally which signal is routed to the clock distribution grid. At least the compiler does not complain. Yet it does not work. Rene -- Ing.Buero R.Tschaggelar - http://www.ibrtses.com & commercial newsgroups - http://www.talkto.net Article: 51977
Hi Rene, that's bad design practice, you're right. To make sure that both of the clocks are routed through dedicated clock distribution nets, you should insert (instantiate) a so-called GLOBAL buffer. The GLOBAL is an Altera primitive which allows you to route any internally generated (in your case multiplexed) clock signal onto the dedicated clock routing resources. You can verify the clock routing by opening the Max+plus II or Quartus II floorplan editor and taking a look at the equations editor. If the net is routed through dedicated resources it will look something like this: clock(global). Your local Altera buddy can assist you with this if something is not clear. regards, phil Article: 51978
"Marc Van Riet" <marcvanriet@yahoo.com> wrote > I was wondering if it would be do-able for me to port the GNU C-compiler to > my own core. I guess it is possible, since the ARM has conditional > instructions too, and it has multiple instructions in one opcode too. GCC's back end is very flexible and can be re-targeted to just about anything. However, if you're asking this on comp.arch.fpga (rather than one of the compiler groups) then I think it's likely that your chances of actually doing it yourself within one lifetime are about the same as mine. Think paper cats and burning barns. -- Jonathan Bromley, Consultant DOULOS - Developing Design Know-how VHDL * Verilog * SystemC * Perl * Tcl/Tk * Verification * Project Services Doulos Ltd. Church Hatch, 22 Market Place, Ringwood, Hampshire, BH24 1AW, UK Tel: +44 (0)1425 471223 mail: jonathan.bromley@doulos.com Fax: +44 (0)1425 471573 Web: http://www.doulos.com The contents of this message may contain personal views which are not the views of Doulos Ltd., unless specifically stated.Article: 51979
Hi all, I'm working on a board with a Xilinx XC9572XL CPLD. I put a 2-input AND gate between input and output pins and I get a 5 ns propagation delay. Now I need to replace it with a 3-input AND gate: is it still possible to get a 5 ns propagation delay? If I think of the 3-input AND gate as two levels of 2-input AND gates, would I get 10 ns? How can I design it to have 5 ns? Thanks a lot! -- Stefano Mora email: stefano.mora@*libero.it (remove *) Article: 51980
Hi, for my thesis I need an implementation of a 1024-bit adder for a Xilinx Virtex FPGA (XCV600). Up to now I have tried to add all bits in one step with the following approaches: - Carry-Look-Ahead-Tree: As far as I can see, this will not lead to a satisfactory result. However, this should be the fastest adder (time log(n)). - Carry-Select-Adder: I have read that the built-in carry logic is very fast and therefore I built up the adder from a 32-bit adder from the Xilinx Core Generator. I used two different approaches for the multiplexers: firstly a BUFT based bus multiplexer and secondly a LUT based multiplexer from the synthesis tool. To sum up, the fastest adder I was able to implement led to a period of 20 ns. As the rest of my logic is significantly faster, I need to accelerate this addition at least by a factor of 2. It occurred to me that it may be better to split the addition up, i.e. use more than one step (clock cycle). Are there any other approaches around or does anyone know a way to speed up one of the adders mentioned above? Any help is much appreciated. Best wishes, Lars. -- GnuPG public key: http://www.ida.ing.tu-bs.de/~larsu/larsu_ida_ing_tu-bs_de.key Article: 51981
> I never do it. But try: > 1.From Xact 1.5i generate the vhdl ouput files as you generate it for a > post P&R simulation. > 2. From HDS, import the vhd files. > 3. As the vhdl files will be based on Xilinx components, you will need > to import the Xilinx Lib too. > > Let me know > It will most likely work with the correct XILINX libraries (UNISIM, SIMPRIM). But: - Then the VHDL netlist is constructed from XILINX atoms, i.e. it is very hard to read, and design maintenance and changes are almost impossible. - I don't have XST 1.5i any more ;o( Thanks for your idea anyway! Article: 51982
Dear Lars, Have you considered implementing a serial full adder? This only requires one slice of logic and can run very fast (as there is no carry chain). This will then link nicely to block RAM in 4096 x 1-bit aspect ratio (4 very long words per BRAM) and to the SRL16E shift registers in the slices (32 slices provide a 1024-bit serial store). Obviously a serial adder will take 1024 clock cycles to execute the addition (1025 with bit growth), but you do not have to wait for the whole addition to complete to start a new one. It is like the checkout at a supermarket: you can be loading the checkout conveyor belt at the same time as someone else is loading their shopping at the other end. With a serial adder, you can start a second addition as soon as you have resolved the LSB after one clock cycle (assuming it is pipelined). For example, consider implementing y=a+b+c+d+e+f+g+h in the form of an addition tree. This requires 7 adders (4 followed by 2 followed by 1), which is just 7 slices. It will take 1024+1+1+1=1027 clock cycles to complete (assuming no bit growth). It also has the advantage that only the final answer needs to be stored. This is just one of the techniques studied on the Xstreme DSP Techniques Course which I wrote for Xilinx. http://www.xilinx.com/support/training/abstracts/v4/atp-dsp.htm I hope you can attend the course at some time, I think you would enjoy it. Yours sincerely, Ken Chapman Article: 51983
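As a rough behavioural model of the bit-serial adder described above (not the course material, and not HDL), the following Python sketch adds two LSB-first bit streams with a single full adder and a carry register, one bit per simulated clock. The helper names are made up for illustration. `tree_latency` just reproduces the cycle-count arithmetic from the post, assuming one extra clock per pipelined tree level.

```python
def to_bits(value, width):
    """Integer -> list of bits, LSB first (the order a serial adder consumes them)."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """LSB-first bit list -> integer."""
    return sum(bit << i for i, bit in enumerate(bits))


def serial_add(a_bits, b_bits):
    """One full adder plus a carry flip-flop; one loop iteration per clock."""
    carry = 0
    out = []
    for a, b in zip(a_bits, b_bits):
        out.append(a ^ b ^ carry)                # sum bit for this clock
        carry = (a & b) | (carry & (a ^ b))      # carry register for the next clock
    out.append(carry)                            # extra clock for bit growth
    return out


def tree_latency(width, operands):
    """Clocks for a pipelined serial adder tree: width plus one per tree level."""
    levels = (operands - 1).bit_length()         # ceil(log2(operands)) for operands >= 1
    return width + levels


a = (1 << 1024) - 1
b = 1
total = from_bits(serial_add(to_bits(a, 1024), to_bits(b, 1024)))
print(total == a + b, tree_latency(1024, 8))
```

The eight-operand case gives 1024 + 3 = 1027 clocks, matching the figure quoted in the post.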
"Jonathan Bromley" <jonathan@oxfordbromley.u-net.com> wrote in message news:b15gm3$l5m$1$8300dec7@news.demon.co.uk... > "Marc Van Riet" <marcvanriet@yahoo.com> wrote > > I was wondering if it would be do-able for me to port the GNU C-compiler > to > > my own core. I guess it is possible, since the ARM has conditional > > instructions too, and it has multiple instructions in one opcode too. > Have you looked at lcc? It's a C compiler generator. Have a look at the Gray Research web-site, and the XR16 processor etc. Jan Gray used lcc to create a C compiler for his processor, regards Alan -- Alan Fitch HDL Consultant DOULOS - Developing Design Know-how VHDL * Verilog * SystemC * Perl * Tcl/Tk * Verification * Project Services Doulos Ltd. Church Hatch, 22 Market Place, Ringwood, Hampshire, BH24 1AW, UK Tel: +44 (0)1425 471223 mail: alan.fitch@doulos.com Fax: +44 (0)1425 471573 Web: http://www.doulos.com The contents of this message may contain personal views which are not the views of Doulos Ltd., unless specifically stated.Article: 51984
One solution: since you need the input to be routed onto the global clock tree, you can assign a dedicated global clock I/O pin to it and wire the output to the input in hardware. The output clock can be driven from any I/O pin, not necessarily a global clock pin... Article: 51985
Stefano M wrote: > Hi all, > i'm working on a board with a Xilinx XC9572XL CPLD. > I put a 2in AND gate between input and output pins and > i obtain a 5ns propagation delay. > Now i need to substitute that to have a 3in AND gate: is > it possible to have again 5ns propagation delay ? > If i think 3in AND gate as 2layer 2in AND gate maybe > i should have 10 ns ? > How can i design it to have 5ns ? > > Thanks a lot ! > -- > Stefano Mora > email: stefano.mora@*libero.it > (remove *) The basic logic feeding an XC95K macrocell output is 5 `and' terms feeding an `or' gate. The `and' terms are as wide as the number of inputs from the UIM to a function block [56 for XL ?] and so you should be able to get a 56 input `and' gate with the same delay. Interestingly, however, some synth tools *do* make wide `and's out of a chain of 2 or 3 input primitives but the fitter will optimise this down to a single gate as long as MULTI_LEVEL_LOGIC_OPTIMIZATION is set in the .ctl file.Article: 51986
There are several ways to skin this cat. You can mix carry techniques. Use the fast carry chain for reasonable sized sub-adders (say 32 bits), then use one of the fast carry techniques such as carry look-ahead, carry-skip or carry select to combine the results. Depending on your performance target, you probably will still need to do some pipelining. Another option is to use 1024 serial adders (assuming you need a result every clock cycle), then arrange the adds so that each one is started on clock after the previous. These will run very fast, as there is no carry chain to worry about, and the adder itself only occupies one slice. You will need a shift register for each one however, so it will get to be quite large if you need to do an add per clock. The high clock capability may let you do several clocks per sample, which will greatly reduce the number of instances you will need. In your case, I think the best bet may be to just use pipelining and accept the latency. Lars Unger wrote: > Hi, > > for my thesis I need an implementation of an 1024-bit adder for a Xilinx Virtex > FPGA (XCV600). Up to now i have tried to add all bits in one step with the > following approaches: > > - Carry-Look-Ahead-Tree: As far as i can see, this will not lead to an > satisfactory result. However, this should be the fastest adder (time log(n)). > > - Carry-Select-Adder: I have read that the built-in carry logic is very fast > and therefore i build up the adder with an 32-bit adder from the Xilinx Core > Generator. I used two different approaches for the multiplexers: Firstly a BUFT > based bus multiplexer and secondly a LUT based multiplexer from the synthesis > tool. > > To sum up, the fastest adder i was able to implement lead to a period of 20 ns. > As the rest of my logic is significally faster, i need to accelerate this > addition at least by a factor of 2. It occured to me that it may be better to > split the addition up,i.e. use more than one step (clock cycle). 
> > Are there any other approaches around or does anyone knows a way to speed up > one of the adders mentioned above ? Any help is much appreciated. > > Best wishes, > Lars. > -- > GnuPG public key: > http://www.ida.ing.tu-bs.de/~larsu/larsu_ida_ing_tu-bs_de.key -- --Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email ray@andraka.com http://www.andraka.com "They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759Article: 51987
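To make the "fast carry chain for 32-bit sub-adders, then carry select to combine" suggestion concrete, here is a small behavioural sketch in Python. It is an assumption-laden model, not a hardware description: each block computes both candidate sums (carry-in 0 and carry-in 1), which in an FPGA would happen in parallel, and the incoming carry then only drives a multiplexer. The function name and the default sizes (1024 bits, 32-bit blocks) are chosen to match the thread.

```python
def carry_select_add(a, b, width=1024, block=32):
    """Behavioural model of a carry-select adder built from `block`-bit sub-adders."""
    mask = (1 << block) - 1
    result = 0
    carry = 0
    for shift in range(0, width, block):
        x = (a >> shift) & mask
        y = (b >> shift) & mask
        sum_if_0 = x + y          # sub-adder assuming carry-in = 0
        sum_if_1 = x + y + 1      # sub-adder assuming carry-in = 1
        chosen = sum_if_1 if carry else sum_if_0   # the select multiplexer
        result |= (chosen & mask) << shift
        carry = chosen >> block                    # carry-out selects the next block
    return result, carry


x = 0x1234_5678_9ABC_DEF0_FFFF_FFFF_FFFF_FFFF
y = 0x0FED_CBA9_8765_4321_0000_0000_0000_0001
total, carry_out = carry_select_add(x, y)
print(hex(total), carry_out)
```

In this model the critical path per block is one 32-bit add plus a chain of select multiplexers; pipelining between groups of blocks, as the post recommends, would cut that multiplexer chain into shorter pieces.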
Yes Marc, it is possible. Pull gcc from somewhere and check the stormy16 port (h8300 is more complex, yet understandable). It looks similar to yours and is easy to understand. Also, you may want to check the msp430 port (mspgcc.sf.net), which is simple as well. However, if you want to _implement_ the toolchain, first of all you have to write binutils and gdb support with a simulator... just in order to check if gcc works :) Cheers, Dmitry. Marc Van Riet wrote: > Hi, > > I'm writing a small processor core as a hobby. It is a small 16-bit core > with about 20 instructions, including simple ALU functions, memory read > and > writes, stack operations, ... All instructions can be conditional > (depending on the value of my 'working register'). Up to three operations > can be combined in a single opcode, although not all combinations are > possible. > > I was wondering if it would be do-able for me to port the GNU C-compiler > to > my own core. I guess it is possible, since the ARM has conditional > instructions too, and it has multiple instructions in one opcode too. > > But how long would that take me ? Anybody any experience in porting GNU-C > to a similar core as mine ? > > Are there any alternatives ? The LCC compiler isn't an option I think, > because their original target core doesn't have multiple-instructions or > conditional code. > > Any comments are appreciated, > Marc Article: 51988
Hello, can any conflict occur if I install 2 versions of the Xilinx tools on the same machine? Thanks Article: 51989
I have a really ancient XC3020 that I can play with (I am new to FPGA and Xilinx tools). It's a PLCC and I have a socket, so attaching wires (for programming and attaching a few LEDs etc.) is simple, unlike the later TQFP packages. What I think I need is a .nph file for the Xilinx ISE software so that I can use my XC3020. Is this file available? If so, where? And is this all I need to be able to use Xilinx ISE Webpack software to target the XC3020? Thanks Andrew Rogers Article: 51990
You can install 2 versions. You will need to change the XILINX environment variable to point to the version you are working with each time you switch back and forth between versions. Moss Ben wrote: > Hello, > Can any conflict occurs if i install 2 versions of the xilinx tool in > the same machine. > thanks -- --Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email ray@andraka.com http://www.andraka.com "They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759Article: 51991
This is possible but it will need some manual work.
1) Install the first Xilinx software.
2) Remove the Xilinx environment variable from your machine.
3) Install the second Xilinx software to a different directory.
4) The environment variable now points to the second software. To use the first version, change the environment variable to point to its installation path.
Also make sure that both software versions are supported by your OS. Regards, Wei Moss Ben wrote: > Hello, > Can any conflict occurs if i install 2 versions of the xilinx tool in > the same machine. > thanks Article: 51992
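One way to avoid editing the variable by hand each time is to launch the tools from a small wrapper that sets XILINX per invocation. The sketch below is only an illustration: the install directories are hypothetical placeholders, and it assumes, as the replies above state, that the XILINX variable is what selects the version. Whether anything else (for example PATH) also needs to change depends on the installation and is not covered here.

```python
import os

# Hypothetical install locations; substitute the real directories.
INSTALLS = {
    "old": r"C:\Xilinx_old",
    "new": r"C:\Xilinx_new",
}


def xilinx_env(version, base_env=None):
    """Return a copy of the environment with XILINX pointing at one install."""
    env = dict(os.environ if base_env is None else base_env)
    env["XILINX"] = INSTALLS[version]
    return env


env = xilinx_env("old", base_env={"PATH": "somewhere"})
print(env["XILINX"])
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when starting a tool, so the machine-wide setting stays untouched.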
Hi everybody! The PCI card I started to design months ago is near completion. Thanks a lot to all the people that answered my questions in this forum! Near completion means that it provides the required functionality, but many details are still open. Regarding configuration, I have difficulty interpreting what the PCI spec says about the address assignment. I recall from the spec that the computer, at power-up time, reads the configuration registers, reads BAR0, BAR1, etc., to find out whether any is implemented. I implemented the BAR0 register, which means that if the host writes data to it, when it reads the data back, some of the bits are set. Then, the host decides what address to assign to my device by looking at the weight of the lowest bit set, so, if BAR0 gives back 0xfffff000, the lowest bit is the 11th, or a decoding space of 2k, right? Then, my card should end up with an address that is a multiple of 2k, in any location of the memory space; that is what I understood. So, is the base address assignment a multipass cycle? I mean, the host first writes 0xffffffff to my BAR0, then it reads the data back, then it writes the assigned base address to my BAR0? I did the following: preset my BAR0 register to 0x00000005 (the BAR0(0) bit indicates the space, memory or I/O) and got, as a result, the following address assigned: 0x02000001. What could that possibly mean? A curious thing: the card is double layer, carefully laid out to minimize the lengths (which never, never approached the 1.5 inches that the spec insists on) and the number of vias each trace has to pass through; it uses an old socketed PLCC84 FPGA, which surely doesn't comply with the required 10 pF per pin max. That description alone would have given me a fever not long ago, thinking of the bus disasters it could cause. But, to my surprise, not even a wire 10 inches long, soldered to a via (!!!) on my card, could disrupt the normal behavior of the bus. The bus is tougher than I thought.
Excuse the length of the question! Mauricio Lange Article: 51993
You need to get your overall timing margins so that your feedback gives you sampling in the very center of your data window. The Xilinx app notes do a good job of detailing the timing margin calculations. The SSTL signals used for data, address, and control are what feed the DDRs and what they deliver back to the FPGA. The feedback clock should be the same IO standard. In the app note timing margin calculations, there is an explicit mention of the overall routed length required to achieve the margin in the example (unless I'm getting my Xilinx DDR app notes mixed up and the one you're looking at missed that detail). Based on the propagation delay of the signals in your layout (signals on an internal layer travel slightly slower than on an outer layer), you can figure out just how much length to put in. Look at the timing budget numbers. Understand where they come from. You'll be able to get some very good timing if you pay attention to those numbers. As with any SSTL signal, you have a series resistor near the source (look at all the other SSTL signals) and a resistor to VTT at the end. That's how the signal for the feedback has to be routed for proper delay matching. Ah, yes. Virtex-II. I almost forgot about the phase shifting in the DCM. I prototyped a DDR with a Virtex-II and used the phase shift feature but the feedback is still external SSTL-II. Rather than adding the extra length, I compensated with the phase shift. In the Spartan-IIE for production I didn't have that "luxury" but by matching the lengths externally I got a tighter timing window (more margin) than the Virtex-II thanks to propagation delay matching and internal Xilinx component delay tracking. Virtex-II still needs the resistors to look like an SSTL-II even if you choose not to run the extra signal length. So take my initial rambling as an approach without the phase shift or use the last paragraph for a short external route. But do go external.
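The sizing handshake asked about above (write all ones, read back, then write the assigned base) can be checked with a little arithmetic. The Python sketch below is a minimal illustration for 32-bit BARs only; the function names are made up, and it relies on the standard BAR layout in which bit 0 selects I/O space (1) or memory space (0), with the low 2 bits of an I/O BAR and the low 4 bits of a memory BAR not being address bits.

```python
def bar_size(readback):
    """Size in bytes implied by the value read back after writing 0xFFFFFFFF.

    Returns (is_io, size); size 0 means the BAR is not implemented.
    """
    is_io = bool(readback & 0x1)
    addr_mask = 0xFFFFFFFC if is_io else 0xFFFFFFF0
    addr_bits = readback & addr_mask
    if addr_bits == 0:
        return is_io, 0
    size = (~addr_bits + 1) & 0xFFFFFFFF   # two's complement = weight of lowest set bit
    return is_io, size


def decode_bar(value):
    """Split an assigned BAR value into (address space, base address)."""
    if value & 0x1:
        return "io", value & 0xFFFFFFFC
    return "memory", value & 0xFFFFFFF0


print(bar_size(0xFFFFF000))               # the read-back value from the question
print(decode_bar(0x02000001))             # the assigned value from the question
```

With these rules, a read-back of 0xfffff000 has bit 12 as its lowest set bit, which corresponds to a 4096-byte window, and a value of 0x02000001 reads as an I/O-space BAR with base 0x02000000, because bit 0 is set.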
"Florian" <flo78de@yahoo.de> wrote in message news:3e362717$0$3025$9b622d9e@news.freenet.de... > Hello, > i'm designing a board with a VirtexII that takes masses of data from several > inputs and stores it in 2 DDR-RAM channels. For the ram-controller i'm > orientating on xapp200. It says to feed the ddr_clk and possibly ddr_clkb > back into the fpga. My question is how to route this feedback on the pcb? I > see several possibilities: > - shortest way from pad to pad > - from the series termination resitor > - from the ram-socket > - from the parallel termination resitor > And how do i terminate this feedback? (DCI is not an option) > > Any help welcome > > Thank You in advance > Florian > > > > > > > >Article: 51994
creon100@yahoo.com (Sean) writes: > Yeah, that would work, I just think they already have quite a few of > the EPC16's laying around since they were anticipating using them, > lol. This project has been handed off to me by the way, so I'm > working with other people's schematics. Then you might as well use the EPC16 as a regular flash. I don't know for certain that it will work, but I can't see why it shouldn't. If for some reason it does not work, you could emulate an FPGA configuration of your second EPC16 using some data pins on your FPGA. Petter -- ________________________________________________________________________ Petter Gustad 8'h2B | ~8'h2B http://www.gustad.com/petter Article: 51995
"Ray Andraka" <ray@andraka.com> wrote in message news:3E3575FC.3C1C4D5F@andraka.com... > glen herrmannsfeldt wrote: > > "Ray Andraka" <ray@andraka.com> wrote in message > > news:3E32F64E.73A27950@andraka.com... > > > ... A LUT and a ROM are physically the same thing. > > > > > Last I knew (XC4000 days) it was not possible to initialize a LUT > > based RAM. The hardware couldn't guarantee against glitches on the > > R/W line at the end of initialization. So, in FPGA sense, LUT are > > ROM and not RAM. > > > > "Ray Andraka" <ray@andraka.com> > For Virtex and later, the problem with initializing RAM is no longer > there. I have the newer books, but I haven't looked for this since virtex came out. I do remember being surprised reading that the XC4000 couldn't do it. Though this reminds me of a discussion in the days of 2102's (1Kx1 static RAM), whether they would always power up to the same value. I was told that someone had thought of making a static RAM with an initial value. Then wondering if DRAM would tend to power up to the same value each time. -- glenArticle: 51996
If you use the built-in carry ( and you should) then the inter-slice carry routing is direct, and does not cause any delay beyond the one specified in the data sheet. Peter Alfke, Xilinx Applications Moss Ben wrote: > But the path from the LSB to the MSB should also be taken into > consideration. The change in delay of this path when increasing the > size of the adder will depend mainly on the route delay from a slice > to a slice. if i have understood you well, you mean that such delay > is negligeable in from of the time in,time-out of the registers. is > that right? > > John-H you speak about interactive data sheets, could u please post > the URL. > v just checked Xilinx, there is only PDF file. > > thanks > > Ray Andraka <ray@andraka.com> wrote in message news:<3E35754B.BB2C81F3@andraka.com>... > > The time to get on and off the ends of the carry chain are much greater > > than the incremental carry chain, so you may not recognize the deltas due > > to just the carry chain. Also, the carry chain is implemented with a fast > > carry lookahead at the slice level, so the speed of 1 bit vs 2 bits is > > virtually the same. > > > > Moss Ben wrote: > > > > > Hello, > > > Using Virtex > > > If we implement an N-bit adder using the carry logic > > > will not the 2N-bit adder run at half the speed of N-bit adder > > > This what i expect, but i have read that this not the cases with carry > > > logic of Virtex,i.e. its propagation delay is not linear? > > > is that true? > > > if yes, how is the propagation delay of the carry Logic > > > thanks > > > > -- > > --Ray Andraka, P.E. > > President, the Andraka Consulting Group, Inc. > > 401/884-7930 Fax 401/884-7950 > > email ray@andraka.com > > http://www.andraka.com > > > > "They that give up essential liberty to obtain a little > > temporary safety deserve neither liberty nor safety." > > -Benjamin Franklin, 1759Article: 51997
"Moss Ben" <mosaab2k@yahoo.com> wrote in message news:7014b7dd.0301271632.2c0c1fe8@posting.google.com... . . . > John-H you speak about interactive data sheets, could u please post > the URL. > v just checked Xilinx, there is only PDF file. . . . 1) xilinx.com 2) Support (top row of tabs) 3) Documentation (second row of tabs) 4) Interactive Data Sheets (under the Additional Documentation setting) http://www.xilinx.com/applications/web_ds/index_top.htm You have a few to choose from. For figuring out carry chain stuff, the interactive data sheet may not have as many numbers as the old databook I keep referring to but the most important ones are there. Sometimes running the "speedprint" command line utility for my device helps me get back to (nearly) the full selection of timing numbers. (speedprint: http://toolbox.xilinx.com/docsan/xilinx5/data/docs/dev/dev0108_18.html )Article: 51998
"Lars Unger" <larsu@ida.ing.tu-bs.de> wrote in message news:Pine.LNX.4.50.0301281035570.8058-100000@wichtel.ida.ing.tu-bs. de... > Hi, > > for my thesis I need an implementation of an 1024-bit adder for a Xilinx Virtex > FPGA (XCV600). Up to now i have tried to add all bits in one step with the > following approaches: > (snip) I would say that a pipelining approach should help. I am interested to know what kind of problem would use a 1024 bit adder. -- glenArticle: 51999
"Mauricio Lange" <weirdo@bbs.frc.utn.edu.ar> wrote in message news:2f938098.0301280802.2552a0ab@posting.google.com... > The PCI card I started to design months ago is near completion. Thanks > a lot to all the people that answered my questions in this forum! > Near completion means that it provides required functionality, but > many details are in the road yet. (snip) > > A curious thing: The card is double layer, carefully laid out to > minimize the lenghts (that never, never aproached to the 1.5 inches > that the spec insists on using) and the number of vias thru each trace > has to pass; uses a socketed PLCC84 old FPGA, that surely doesn't > comply with the required 10pf per pin max. > That only description could give me fever not a long time ago, > thinking of the bus disasters it could cause. But, to my surprise, not > even a wire 10 inches long, soldered to a via (!!!) in my card, could > disrupt the normal behavior of the bus. The bus is tougher than I > thought. Both the capacitance and wire length will be set for the maximum number of cards on the bus. If you have only one or two you can probably greatly exceed those. -- glen