Thank you all very much for your answers. They really helped me.

I would also like opinions on which of the two technologies is better for implementing the control system of an exchange-supplier machine. Would it be better on an Altera MAX FPGA or on an 81 family microprocessor?

------------------------------------------------------------------------
Marcel Figuerola Estrada
pfa@tinet.fut.es
Valls - Catalunya - Europe
------------------------------------------------------------------------
Article: 17776
See http://www.em.avnet.com/semis/ads/virtex.html

This development system card has a PCI interface, 512Kb of ZBT RAM, 16Mb of SDRAM, 2Mb of Flash, a Fire-Wire port, two USB transceivers, an RS-232 interface, and an 80MHz RAM-DAC.

-Jeff

Sukandar Kartadinata wrote:
> I'm looking for a Virtex development board that has (supports) at least
> 16MBytes (or rather 4Mwords with 32bits word width, or 2Mx64bits) of
> fast SDRAM.
> Even better would be a board with two memory interfaces.
>
> Thanks for any suggestions
> Sukandar Kartadinata
> sk@zkm.de
Article: 17777
We've used them for many years. Like any vendor, we've had our issues with them. They have always had good customer service and tech support, and they resolved our problems to my satisfaction. I would definitely give them above-average marks.

--
Keith F. Jasinski, Jr.
kfjasins@execpc.com

Wayne Miller <wayne@aerial-imaging.com> wrote in message news:37CE019E.6A7B3003@goodware.com...
> Anyone have any experience (good or bad) with FPGAs from QuickLogic?
Article: 17780
> FPGA/PLD in fine pitch BGA or chip scale package ???

Lattice also offers many device density/architecture choices in BGA and TQFP packages. With the recent acquisition of Vantis, Lattice has a very broad choice in 5V and 3.3V.

Just my .02

Michael Thomas
Lattice SFAE NY/NJ
Article: 17781
Just a wild guess, because the context is probably important. Since many logic functions have don't-care inputs, not all of the minterms need to be evaluated in that case. Here "partial" may refer to an incomplete evaluation.

--alvin

On Wed, 1 Sep 1999 wudong99_1998@my-deja.com wrote:
> I read the phrase "Partial Evaluation" in many papers but I don't know
> the exact meaning of it. Who can tell me the meaning or where I can
> find the meaning. Thanks!
>
> Sent via Deja.com http://www.deja.com/
> Share what you know. Learn what you don't.

###########################################################
Alvin E. Toda          aet@lava.net
sr. engineer           Phone: 1-808-455-1331
2-Sigma                WEB: http://www.lava.net/~aet/2-sigma.html
1363-A Hoowali St.
Pearl City, Hawaii, USA
Article: 17782
Ray Andraka wrote:
> Designing to the architecture will have more bearing on whether or not you
> meet speed requirements than choice of tools.

The first is a requirement. The second will make the job easier.

> I'd like to know how you proposed to floorplan Altera? You can use the
> graphical floorplanner in the Maxplus tools, but if you iterate the design
> you have to start over.

It has been a couple of years since I did an Altera 10K design, but I used the MaxPlus graphical floorplanner and I don't recall having to start over.

--
Phil Hays
"Irritatingly, science claims to set limits on what
we can do, even in principle." Carl Sagan
Article: 17783
We've used QuickLogic FPGAs for a couple of years. Man, these things are fast and easy to route ... love that SpDE tool. The main complaint is the lack of flops vs. the chief anti-fuse competitor (Actel).

Wayne Miller wrote:
>
> Anyone have any experience (good or bad) with FPGAs from QuickLogic?
Article: 17784
Hi all,

I'm searching for a technology comparison of FPGA families from Xilinx, Altera, Actel, QuickLogic, Lucent, etc., in terms of hardware (capacity, technology, organisation, LUTs, routing resources, etc.).

Does anyone know of reports, theses, papers, or design notes?

Thank you

Thierry GARREL
E-mail : tga@magic.fr

+--------------------------------------+
| MATRA SYSTEMES ET INFORMATION        |
| 6 avenue des tropiques - BP 80       |
| 91943 COURTABOEUF CEDEX / FRANCE     |
| Tel : (33) 1 69 86 85 00             |
| Fax : (33) 1 68 07 03 70             |
| Mail : ceo@matra-des.mgn.fr          |
+--------------------------------------+
Article: 17785
I now have an evaluation copy of Exemplar's LS. I've done a little with the cores offered by the FreeIP page:

http://www.free-ip.com/

I downloaded the FreeDES and Free6502 source files, and had a look at them. I have several comments:

1) I can't yet get Leonardo Spectrum to do "Extended Optimization Effort" on the FreeDES design, as this computer has only 64 MB of memory, and LS seems to need about twice that. The memory is on order; I'll get it next week. Not that it really matters much, but Synplify (sum of several processes) seems to top out at 26 MB. Memory is fairly cheap compared to the tools, and I would forgive a tool for needing more memory if it just did a better job. But so far, the size of the resulting output from LS is in between the results of Synplify and FPGA Express. I will post results when I can have the tool do all optimizations for size. I also want to look at speed.

2) The Free6502 core is an interesting design, as I think it could be made to run at a rather higher clock rate. To do this, I'm going to make some suggestions for change in a step-by-step fashion. I suppose I might just propose a final design, but perhaps the step-by-step fashion will allow for a better discussion as to why I want to make these changes to speed up this design. These changes may require microcode changes: I'm not going to do that. These changes will require running a regression test to ensure that the processor is still doing all instructions correctly: I would run such a regression test if it was automatic, but will not otherwise.

The longest path starts at the microcode address, goes through the ALU and into the flags logic. This path involves most of the design. As it is such a large amount of logic, and as it is the critical path, let us start by adding a pipeline register.

The first change I propose is to add a pipeline register between the output of the Microcode ROM and the rest of the design. This will require adding an extra state to the "state" state machine that we might call "decode", may require changes to the step counter and may require microcode changes, both of which I did not do. State "decode" will always follow after "fetch". This might add a clock to every instruction; however, it will increase the clock rate from the original 28.5 MHz to something over 45 MHz (I did Xilinx PAR using a slower -4 part and hit 45 MHz).

In the file free6502.vhd:

    type STATES is (RESET1, RESET2, FETCH, DECODE, START_IRQ, START_NMI, RUN);
                                           ^add

    -- The main state machine
    process (clk, reset, nmi_event, irq_reg, i_flag, done)
    begin
      if reset='1' then
        state <= RESET1;
      elsif clk'event and clk='1' then
        case state is
          when RESET1 => state <= RESET2;
          when RESET2 => state <= RUN;
          when FETCH  => state <= DECODE;  -- change
          when DECODE => state <= RUN;     -- change

    -- The microcode step counter
    process (clk, reset, state)
    begin
      if reset='1' then
        step <= "000";
      elsif clk'event and clk='1' then
        case state is
          when RESET1    => step <= "000";
          when RESET2    => step <= "000";
          when FETCH     => step <= "000";
          when START_IRQ => step <= "000";
          when START_NMI => step <= "000";
          when DECODE    => step <= step + 1;  -- change

In the file MICROCODE.VHD:

    architecture mc_rom_arch of mc_rom is
      signal addr       : std_logic_vector (10 downto 0);
      signal DONE_c     : MCT_DONE;
      signal ADDR_OP_c  : MCT_ADDR_OP;
      signal DIN_LE_c   : MCT_DIN_LE;
      signal RD_EN_c    : MCT_RD_EN;
      signal DOUT_OP_c  : MCT_DOUT_OP;
      signal DINT1_OP_c : MCT_DINT1_OP;
      signal DINT2_OP_c : MCT_DINT2_OP;
      signal DINT3_OP_c : MCT_DINT3_OP;
      signal PC_OP_c    : MCT_PC_OP;
      signal SP_OP_c    : MCT_SP_OP;
      signal ALU1_c     : MCT_ALU1;
      signal ALU2_c     : MCT_ALU2;
      signal ALU_OP_c   : MCT_ALU_OP;
      signal A_LE_c     : MCT_A_LE;
      signal X_LE_c     : MCT_X_LE;
      signal Y_LE_c     : MCT_Y_LE;
      signal FLAG_OP_c  : MCT_FLAG_OP;
    begin
      --
      addr <= opcode & step;
      --
      Process(clk)
      begin
        if rising_edge(clk) then
          DONE     <= DONE_c;
          ADDR_OP  <= ADDR_OP_c;
          DIN_LE   <= DIN_LE_c;
          RD_EN    <= RD_EN_c;
          DOUT_OP  <= DOUT_OP_c;
          DINT1_OP <= DINT1_OP_c;
          DINT2_OP <= DINT2_OP_c;
          DINT3_OP <= DINT3_OP_c;
          PC_OP    <= PC_OP_c;
          SP_OP    <= SP_OP_c;
          ALU1     <= ALU1_c;
          ALU2     <= ALU2_c;
          ALU_OP   <= ALU_OP_c;
          A_LE     <= A_LE_c;
          X_LE     <= X_LE_c;
          Y_LE     <= Y_LE_c;
          FLAG_OP  <= FLAG_OP_c;
        end if;
      end process;

      U00: DONE_rom     port map (addr, DONE_c);
      U01: ADDR_OP_rom  port map (addr, ADDR_OP_c);
      U02: DIN_LE_rom   port map (addr, DIN_LE_c);
      U03: RD_EN_rom    port map (addr, RD_EN_c);
      U04: DOUT_OP_rom  port map (addr, DOUT_OP_c);
      U05: DINT1_OP_rom port map (addr, DINT1_OP_c);
      U06: DINT2_OP_rom port map (addr, DINT2_OP_c);
      U07: DINT3_OP_rom port map (addr, DINT3_OP_c);
      U08: PC_OP_rom    port map (addr, PC_OP_c);
      U09: SP_OP_rom    port map (addr, SP_OP_c);
      U10: ALU1_rom     port map (addr, ALU1_c);
      U11: ALU2_rom     port map (addr, ALU2_c);
      U12: ALU_OP_rom   port map (addr, ALU_OP_c);
      U13: A_LE_rom     port map (addr, A_LE_c);
      U14: X_LE_rom     port map (addr, X_LE_c);
      U15: Y_LE_rom     port map (addr, Y_LE_c);
      U16: FLAG_OP_rom  port map (addr, FLAG_OP_c);
    end mc_rom_arch;

--
Phil Hays
"Irritatingly, science claims to set limits on what
we can do, even in principle." Carl Sagan
Article: 17786
Hi,

Putting a verilog design directly into MaxplusII for a 10K40 device and having problems with LPMs.

If I instantiate lpm_addsub and use the defparam to define the port width etc., it works fine if I do this from the top-level of hierarchy. If then I place the same instantiations and same defparams in a lower level of hierarchy, the compiler complains that the pin/stub names differ between the instantiation and the Function Prototype ... it thinks it's got dataa-1, where it should be dataa[LPM_WIDTH-1] ... it seems not to be picking up the defparam correctly ...

Anyone got any ideas what may be causing this?

Also, while we're at it (so to speak!), I have an asynchronous processor bus that I'm latching using the Verilog "control = (~nEn)? data : control" and then double clocking with the internal clock to synchronise ... is this a valid way of doing it in Altera? Not sure because I'm getting

    Design Doctor Warning: Unknown combinatorial latch
    detected at primitive 'xyz'

messages from design doctor ... I get these however if I use their lpm_latch

Thanks in advance,

Gary Cook,
Oxford, UK.
Article: 17787
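A note on the asynchronous bus question in Gary Cook's post above: one common alternative to a combinatorial latch is to synchronise only the strobe and then capture the data with an ordinary clocked register. The following is a minimal sketch in VHDL (the post itself uses Verilog; the same structure can be written in either language). The entity name, port names and the 8-bit width are hypothetical, and it assumes that nEn stays low, with the bus data stable, for at least four clk periods after nEn falls. It is not a statement about what MAX+PLUS II requires.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical module; names and the 8-bit width are illustrative only.
entity async_bus_capture is
  port (
    clk     : in  std_logic;                     -- internal clock
    n_en    : in  std_logic;                     -- asynchronous, active-low strobe
    data    : in  std_logic_vector(7 downto 0);  -- asynchronous bus data
    control : out std_logic_vector(7 downto 0)   -- captured value, synchronous to clk
  );
end entity async_bus_capture;

architecture rtl of async_bus_capture is
  -- Bits 0 and 1 form a two-flop synchroniser for n_en;
  -- bit 2 remembers the previous synchronised value.
  signal en_sync : std_logic_vector(2 downto 0) := (others => '1');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      en_sync <= en_sync(1 downto 0) & n_en;

      -- The synchronised strobe has just gone low: capture the bus.
      -- Assumption: data is still stable at this clock edge.
      if en_sync(2) = '1' and en_sync(1) = '0' then
        control <= data;
      end if;
    end if;
  end process;
end architecture rtl;
```

Because every storage element here is an edge-triggered flip-flop on clk, no latch is inferred. The cost is that the strobe must be several clock periods wide, which may or may not suit a given processor bus.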
Thank you for your help, and sorry for my English.

The more important problem is that I cannot assign more than one signal to a bidirectional bus:

    Activa_CPU: cpu port map(...
        Read_mem => BRead_mem,
        BusData  => BDataBus,   -- inout port (if I use BDataBus here...)
        ...);

    process(BRead_mem,...)
    begin
      if BRead_mem='1' then
        BDataBus <= BusRam;     -- ...I cannot do this
      end if;
    end process;

If I assign the BDataBus signal in the component, I cannot assign another signal to BDataBus. I tried using this structure (as in the example) and adding LPM_BUSTRI into the component.

Sent via Deja.com http://www.deja.com/
Share what you know. Learn what you don't.
Article: 17788
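A note on the bidirectional-bus question in the preceding post: in VHDL, several sources can share a std_logic bus as long as each one releases it by driving 'Z' whenever it is not selected. The process in the post has no else branch, so its driver never lets go of the bus. Below is a minimal sketch of a driver that does release the bus. The entity name, port names and the 8-bit width are hypothetical, and whether the tools map this onto real tri-state buffers or onto multiplexers depends on the target device.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical wrapper: entity, port and signal names are illustrative only.
entity ram_bus_driver is
  port (
    read_mem : in    std_logic;                     -- '1' while the RAM should drive the bus
    bus_ram  : in    std_logic_vector(7 downto 0);  -- data coming from the RAM
    data_bus : inout std_logic_vector(7 downto 0)   -- shared bidirectional bus
  );
end entity ram_bus_driver;

architecture rtl of ram_bus_driver is
begin
  -- Drive the bus only while read_mem is asserted; otherwise release it
  -- ('Z') so that another driver, such as the CPU, can put data on it.
  data_bus <= bus_ram when read_mem = '1' else (others => 'Z');
end architecture rtl;
```

The component on the other side of the bus needs the same discipline: its inout port should also go to 'Z' whenever it is not the one writing.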
Phil Hays wrote:
> 1) I can't yet get Leonardo Spectrum to do "Extended Optimization
> Effort" on the FreeDES design, as this computer has only 64 MB of
> memory, and LS seems to need about twice that. The memory is on order,
> I'll get it next week. Not that it really matters much, but Synplify
> (sum of several processes) seems to top out at 26 MB. Memory is fairly
> cheap compared to the tools, I would forgive a tool for needing more
> memory if it just did a better job. But so far, the size of resulting
> output from LS is in between the results of Synplify and FPGA Express. I
> will post results when I can have the tool do all optimizations for
> size. I also want to look at speed.

I have not used Leonardo, so I cannot say specifically what it does. But I can say that as a general rule, the DES core is a difficult design for synthesis tools to handle. It would not surprise me if your machine needs 128 or 256 megabytes of RAM to work effectively. I'm using 256 meg and Synplify didn't have any problems, but for some reason FPGA Express took forever (6+ hours).

> 2) The Free6502 core is an interesting design, as I think it could be
> made to run at a rather higher clock rate.

You're right, it could be run faster. And you're also right in thinking that the best place to start is by splitting up the critical path with a pipeline register at the output of the microcode ROM. I considered this at one point, but quite frankly I wanted to get the design "out" rather than spend 24+ hours rewriting the microcode. Rewriting the microcode is not difficult, but it does require intimate knowledge of the Free-6502 internals.

I should probably point out that this 6502 is not a slow design, since it does run significantly faster than currently available 6502 chips. Imagine an Apple II at 28 MHz! But I also understand that faster speeds are always desirable.

> To do this, I'm going to
> make some suggestions for change in a step by step fashion.

I think that the best forum to discuss this is the Free-IP mailing list. There is a page on the web site that explains how to subscribe to it. The mailing list is better because we'll get into a lot of detail that is probably only of interest to 10 or 20 other people in this news group-- and completely boring to the other 2 thousand people.

But, given that, I'll respond to some of your points here. If you have more, then we should take it to the mailing list.

> I suppose I
> might just propose a final design, but perhaps the step by step fashion
> will allow for a better discussion as to why I want to make these
> changes to speed up this design.

For the mailing list, you might bring up the entire list of changes. With a step-by-step approach it can be difficult to keep track of some of the larger architectural issues.

> These changes may require microcode
> changes: I'm not going to do that.

I have some tools to edit the microcode. I intended to make these tools available later, once the design is more stable.

> These changes will require running a
> regression test to ensure that the processor is still doing all
> instructions correctly: I would run such a regression test if it was
> automatic, but will not otherwise.

We currently do not have any tools for regression testing. I would appreciate it if someone made that for the Free-6502 core. (hint, hint :)

> The first change I propose is to add a pipeline register between the
> output of the Microcode ROM and the rest of the design. This will
> require adding an extra state to the "state" state machine that we might
> call "decode", may require changes to the step counter and may require
> microcode changes, both of which I did not do. State "decode" will
> always follow after "fetch". This might add a clock to every
> instruction; however, it will increase the clock rate from the
> original 28.5 MHz to something over 45 MHz (I did Xilinx PAR using a
> slower -4 part and hit 45 MHz).

The "decode" state and the changes to the step counter are not required. What is required is that the entire microcode ROM be rewritten. Effectively, your "decode" state would be rolled into the microcode. By adjusting for the pipeline delay in the microcode, more efficient use of the CPU core can be obtained. For instance, this would add one clock to most instructions-- but not all! It also allows some level of overlap between execution of consecutive instructions.

Now that I have said changes to the step counter are not required, here's one change. It would need to be expanded by one bit. I believe that the longest instruction is 8 clocks. So if we added one clock to that, it would be 9 clocks, which would require a step counter that is slightly larger. But that is the only change.

> Phil Hays
> "Irritatingly, science claims to set limits on what
> we can do, even in principle." Carl Sagan

David Kessner
davidk@free-ip.com
http://www.free-ip.com
Article: 17789
In article <37D015E5.A0F36C54@free-ip.com>,
David Kessner <davidk@free-ip.com> wrote:
>Phil Hays wrote:

>> 2) The Free6502 core is an interesting design, as I think it could be
>> made to run at a rather higher clock rate.

>You're right, it could be ran faster. And you're also
>right in thinking that the best place to start is by splitting
>up the critical path with a pipeline register at the output
>of the microcode ROM.

One issue you might consider is whether or not you're trying for an exact cycle-by-cycle duplication of the 6502. I think you need to do that for the cycle counting tricks programmers used, for example in the Atari 2600. It would be neat if you could make a drop-in replacement for the 6502 that would actually run in a classic computer (ignoring, for now, the illegal op-code issue). I think your simple pipeline structure is close to what the actual 6502 did- how else could they get 2-cycle branches? Of course you have to get decimal mode working too... (Atari 800 floating point routines used it... unfortunately the decimal correction is in the ALU path as well)

If you just want a speed increase there are other issues. For example, if the average instruction is 3 cycles and the pipelining makes it 4 instead, you are not getting much of an improvement: 9.5 MIPS (28MHz/3 cycles) compared to 11.25 MIPS (45MHz/4 cycles). Only about 20% faster.

The faster clock rate will require a faster memory as well, of course. Which brings up another issue- I don't really believe these speed numbers anyway, since the (very huge) address output logic is in the memory path. The critical delay might be address output logic - memory read - input register setup time. Of course adding an output register is going to add even more cycles to the instructions.

The microcode ROM is huge in this design (2K by who knows how many bits wide the synthesizer makes it). I bet you could fix this by having separate decode and sequencing ROMs. I think there are only 32 actual instruction sequences, so the ROM would become 256 words instead. I bet you could do even more shrinking by trying to create a more coherent internal bus structure (instead of every register having a big mux in front of it). For example, maybe share the ALU input mux for the data output mux.
--
/*  jhallen@world.std.com (192.74.137.5) */               /* Joseph H. Allen */
int a[1817];main(z,p,q,r){for(p=80;q+p-80;p-=2*a[p])for(z=9;z--;)q=3&(r=time(0)
+r*57)/7,q=q?q-1?q-2?1-p%79?-1:0:p%79-77?1:0:p<1659?79:0:p>158?-79:0,q?!a[p+q*2
]?a[p+=a[p+=q]=q]=q:0:0;for(;q++-1817;)printf(q%79?"%c":"%c\n"," #"[!a[q-1]]);}
Article: 17790
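The throughput comparison in Joseph Allen's post above can be written out as a short worked calculation. It uses only the figures given in the thread (28.5 MHz and 45 MHz clocks, an assumed average of 3 and 4 cycles per instruction); note that the 9.5 MIPS figure matches the 28.5 MHz clock Phil Hays quoted, not a rounded 28 MHz. The averages are the poster's rough estimates, not measurements.

```latex
\[
\text{MIPS} \approx \frac{f_{\text{clk}}\ [\text{MHz}]}{\text{average cycles per instruction}}
\]
\[
\frac{28.5}{3} = 9.5,
\qquad
\frac{45}{4} = 11.25,
\qquad
\frac{11.25}{9.5} \approx 1.18
\]
```

So under these assumptions the pipelined version is roughly 18% faster, which is the "only about 20%" quoted in the post.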
Joseph H Allen wrote:
> >You're right, it could be ran faster. And you're also
> >right in thinking that the best place to start is by splitting
> >up the critical path with a pipeline register at the output
> >of the microcode ROM.
>
> One issue you might consider is whether or not you're trying for an exact
> cycle-by-cycle duplication of the 6502. I think you need to do that for the
> cycle counting tricks programmers used, for example in the Atari 2600. It
> would be neat if you could make a drop-in replacement for the 6502 that
> would actually run in a classic computer (ignoring for now, the illegal
> op-code issue).

With the Free-6502, I didn't attempt to make it cycle accurate. The reason was simple. In order to make it cycle accurate, I could not register all inputs and outputs. This would impact the instruction rate drastically and make portability and ease of integration more difficult. Also, I figured that once you are running this thing at more than 1 or 2 MHz, making it cycle accurate is much less important.

> I think your simple pipeline structure is close to what the
> actual 6502 did- how else could they get 2-cycle branches?

My branches are 4 clocks, not two. This is a good example of why you can't register the I/O and make it cycle-accurate. Here's a step-by-step, at the bus level, of a simple branch (BCS, for instance):

    Clock 1: Fetch the branch opcode from RAM
    Clock 2: Fetch the branch address offset from RAM
    Clock 3: Output the address of the next instruction.

If we registered the I/O then there would be no time between Clocks 2 and 3 to calculate the new address and output it.

> Of course you
> have to get decimal mode working too... (Atari 800 floating point
> routines used it... unfortunately the decimal correction is in the ALU path
> as well)

The ALUs in the original 6502 and the Free-6502 are very different. In the original 6502 the ALU calculates normal stuff, as well as addresses. That is, when you do an indirect address the ALU does the address addition. In the Free-6502 there is a separate ALU for that. I did it this way for speed and simplicity. The original 6502 did it that way to conserve logic in its old technology.

Decimal mode is the main thing on the wish list, right after having a more complete test suite.

> If you just want a speed increase there are other issues. For example if
> the average instruction is 3 cycles and the pipelining makes it 4 instead,
> you are not getting much of an improvement: 9.5 MIPS (28MHz/3 cycles)
> compared to 11.25 MIPS (45MHz/4 cycles). Only about 20% faster.

Right, which was another reason why I didn't bother to pipeline the microcode ROM and rewrite the microcode.

> The faster clock rate will require a faster memory as well, of course.
> Which brings up another issue- I don't really believe these speed numbers
> anyway since the (very huge) address output logic is in the memory path.
> The critical delay might be address output logic - memory read - input
> register setup time. Of course adding an output register is going to add
> even more cycles to the instructions.

Since all I/O are registered, when you have a 25 MHz clock you actually get 40 ns worth of time to access memory (minus the standard setup and hold times for the registers).

> The microcode ROM is huge in this design (2K by who knows how many bits
> wide the synthesizer makes it).

If the actual microcode ROM was done as a standard ROM then you would need a 2k x 35 bit ROM. But it's not. If you did a 2kx35 ROM in Xilinx CLB's it would take over 2200 CLB's, yet the entire Free-6502 core takes about 620 CLB's. Obviously there are some really slick optimizations going on, <wink wink>.

> I bet you could fix this by having separate
> decode and sequencing ROMs. I think there's only 32 actual instruction
> sequences, so the ROM would become 256 words instead.

I'm not sure that having separate decode/sequencing ROM's would help it. The current microcode is fairly compact.

> I bet you could do
> even more shrinking by trying to create a more coherent internal bus
> structure (instead of every register having a big mux in front of it). For
> example, maybe share the ALU input mux for the data output mux.

I'm sure that there is some optimization that could be done here. But, that's why the source code is public-- so people like you can come up with ways to improve it!

> /* jhallen@world.std.com (192.74.137.5) */ /* Joseph H. Allen */

David Kessner
davidk@free-ip.com
http://www.free-ip.com/
Article: 17791
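Two figures in David Kessner's reply above can be checked with simple arithmetic. The 25 MHz clock and the 2k x 35 ROM size come from the post. The figure of 32 ROM bits per CLB is an assumption that the post does not state: it corresponds to an XC4000-style CLB with two 4-input lookup tables of 16 bits each.

```latex
\[
T_{\text{clk}} = \frac{1}{25\ \text{MHz}} = 40\ \text{ns}
\]
\[
2048 \times 35 = 71\,680\ \text{bits},
\qquad
\frac{71\,680\ \text{bits}}{32\ \text{bits/CLB}} = 2240\ \text{CLBs}
\]
```

Under that assumption the result agrees with the "over 2200 CLB's" in the post, compared with about 620 CLBs reported for the whole core.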
In article <37D04968.20C1E1B6@free-ip.com>,
David Kessner <davidk@free-ip.com> wrote:

>Since all I/O are registered, when you have a 25 MHz clock you
>actually get 40 ns worth of time to access memory (minus the
>standard setup and hold times for the registers).

Oh sorry, I was confused. There is an address output register after all- I missed the clk'event.... line.
--
/*  jhallen@world.std.com (192.74.137.5) */               /* Joseph H. Allen */
int a[1817];main(z,p,q,r){for(p=80;q+p-80;p-=2*a[p])for(z=9;z--;)q=3&(r=time(0)
+r*57)/7,q=q?q-1?q-2?1-p%79?-1:0:p%79-77?1:0:p<1659?79:0:p>158?-79:0,q?!a[p+q*2
]?a[p+=a[p+=q]=q]=q:0:0;for(;q++-1817;)printf(q%79?"%c":"%c\n"," #"[!a[q-1]]);}
Article: 17792
I just went through a synthesis round with an Insight FAE. He took my design (which fit in a Virtex XCV50-4 and an XC4020XLA-09 when compiled by FPGA Express) and compiled it with Synplicity. It would not fit in the Virtex part (112%), and completely used up the 4020XLA (100%). I'm not sure if it met speed with Synplicity. Synopsys used 78% of the Virtex, and about 62% of the 4020 (which is interesting in itself, since the Virtex supposedly has more gates - HA!).

It was a pure "logic" design with lots of registers, state machines and a few 16-bit RAM register arrays (about 20 Kgates, running at 40 MHz with up to 10 or more levels of logic).

In the process, he uncovered several Synplicity bugs that were no problem for Synopsys.

I believed the "Synplicity is better" stuff, and had it do better on a few state machine test cases (where I had problems with Synopsys), but for the "big" design, it went south big time. I'll stick with Synopsys, thank you.

Bruce

David Kessner wrote in message <37C16162.64AFC906@free-ip.com>...
>Phil Hays wrote:
>
>> muzo wrote:
>> > just check-out http://www.free-ip.com/DES/index.htm and look for the
>> > comparisons between Synplify and FPGA express.
>>
>> This comparison just isn't very meaningful. It's based on a single flaw
>> in FPGA Express.
>
>> From the above URL:
>>
>> "This was traced down to the S-Boxes, which are implemented as 64x4
>> ROM's. Under FPGA Express, these took about 33 slices where Synplify
>> took 10 slices."
>
>Seeing as I wrote that statement (from the above URL), I guess I should
>put my two cents worth in here...
>
>The comparison of FPGA Express and Synplify, with respect to the
>Free-DES core, is very meaningful but only in applications where ROM's
>are implemented (which I admit is limited). This has a huge impact for
>DES since there are so many ROM's, but it can also have an impact for
>some CPU (microcode rom), and DSP (coefficient tables) applications.
>
>I did a comparison between Synplify and FPGA Express and Synplify
>was better in all applications that I tried-- but the DES code was the
>most drastic. Ignoring the Free-DES core for the moment, typical
>improvements for Synplify were 10 to 20% less logic _OR_ a 10 to 60%
>improvement in speed. This was based on UART, CPU, and Ethernet
>controller designs. Of course, your mileage will vary.
>
>My only complaint about Synplify is that their way to do constraints seems
>awful to me. It may work, but I like what FPGA Express does much
>better.
>
>
>
>> If the team at Synopsys is serious about staying in business, this flaw
>> will be fixed in a matter of months.
>
>The latest version of FPGA Express (I forget the version #, but it
>is what's shipping this month with Xilinx Foundation/Alliance 2.1)
>does not fix this problem.
>
>
>
>> I'm tempted to update my evaluation license for Exemplar (which I think
>> has averaged as the number two tool in my past evaluations) just to try
>> this design.
>
>That would be good. Try the Free-6502 as well. I'd like to put those
>results up on the Free-IP Project web site...
>
>
>Thanks.
>
>David Kessner
>davidk@free-ip.com
>http://www.free-ip.com
>
>
Article: 17793
Hello,

I'm just implementing some video stuff on an Altera Flex8000. At some points the chip behaves unexpectedly:

I have a 4-word FIFO. "dblbuf" is the DMA input and has a constant value for testing purposes, "word" is the FIFO counter, "buffer" is the FIFO memory. All pins are fixed, but the device usage is below 50%. "dllc2" is about 7 MHz.

When simulating, everything works fine. In practice, the buffer sometimes drops information which should be static and constant!!! Worse, when I shift the buffer contents by a fixed 16 bits, it drops information even during simulation (routing dependent).

What's wrong? Is the chip so sensitive to routing? Is the compiler reducing something to asynchronous logic? Are the LPM modules more stable?

Armin

    dblbuf[15..0] : dff;  -- constant
    buffer[63..0] : dff;
    word  [1..0]  : dff;

    buffer[].clk = dllc2;

    case word[].q is
      when 0 =>
        buffer[15.. 0].d = dblbuf[];
        buffer[63..16].d = buffer[63..16].q;
      when 1 =>
        buffer[15.. 0].d = buffer[15.. 0].q;
        buffer[31..16].d = dblbuf[];
        buffer[63..32].d = buffer[63..32].q;
      when 2 =>
        buffer[31.. 0].d = buffer[31.. 0].q;
        buffer[47..32].d = dblbuf[];
        buffer[63..48].d = buffer[63..48].q;
      when 3 =>
        buffer[47.. 0].d = buffer[47.. 0].q;
        buffer[63..48].d = dblbuf[];
    end case;
Article: 17794
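A note on Armin's 4-word buffer above: the excerpt only shows a clock assignment for buffer[], so one thing worth checking is that word[] and dblbuf[] are clocked from the same clock and that nothing feeding them changes asynchronously. For comparison, here is a minimal sketch of the same structure in VHDL with every register on one clock. The entity name, the reset and the wr_en strobe are assumptions that do not appear in the post, and "buffer" is renamed because it is a reserved word in VHDL. This is not a diagnosis of the reported problem.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity; port names loosely follow the AHDL post.
entity word_buffer is
  port (
    dllc2  : in  std_logic;                      -- single clock for every register
    reset  : in  std_logic;                      -- synchronous reset (assumed)
    wr_en  : in  std_logic;                      -- write strobe (assumed), synchronous to dllc2
    dblbuf : in  std_logic_vector(15 downto 0);  -- input word
    buf    : out std_logic_vector(63 downto 0)   -- four 16-bit words
  );
end entity word_buffer;

architecture rtl of word_buffer is
  signal word  : unsigned(1 downto 0) := (others => '0');
  signal buf_r : std_logic_vector(63 downto 0) := (others => '0');
begin
  process (dllc2)
  begin
    if rising_edge(dllc2) then
      if reset = '1' then
        word  <= (others => '0');
        buf_r <= (others => '0');
      elsif wr_en = '1' then
        -- Overwrite only the 16-bit slot selected by the counter; the other
        -- slots keep their value because they are not assigned.
        case word is
          when "00"   => buf_r(15 downto 0)  <= dblbuf;
          when "01"   => buf_r(31 downto 16) <= dblbuf;
          when "10"   => buf_r(47 downto 32) <= dblbuf;
          when others => buf_r(63 downto 48) <= dblbuf;
        end case;
        word <= word + 1;  -- wraps from 3 back to 0
      end if;
    end if;
  end process;

  buf <= buf_r;
end architecture rtl;
```

If the DMA side runs on a different clock, the write strobe would need to be synchronised to dllc2 before it reaches this block.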
Can you share your design with the group so we can have an independent verification of your reports? If it is true, then a huge leap forward has been made by FPGA Express, which until now was trailing badly behind both Synplicity and Exemplar LS.

On the other hand, you could try the FreeDES design at http://www.free-ip.com. The author was not able to use FPGA Express at all (synthesis took over 6 hours).

What were the versions of the FPGA Express and Synplicity tools you used?

Catalin Baetoniu

Bruce Nepple wrote:
> I just went through a synthesis round with an Insight FAE. He took my
> design (fit in Virtex XCV50-4 and xc4020XLA-09 when compiled by FPGA
> Express) and compiled it with Synplicity. It would not fit in the Virtex
> part (112%), and completely used up the 4020XLA (100%). I'm not sure if it
> met speed with Synplicity. Synopsys used 78% of the Virtex, and about 62%
> of the 4020 (which is interesting in itself, since the Virtex supposedly has
> more gates - HA!).
>
> It was a pure "logic" design with lots of registers, state machines and a
> few 16 bit ram register arrays. (about 20Kgates, running at 40Mhz with up to
> 10 or more levels of logic).
>
> In the process, he uncovered several Synplicity bugs that were no problem
> for Synopsys.
>
> I believed the "Synplicity is better" stuff, and had it do better on a few
> state machine test cases (where I had problems with Synopsys), but for the
> "big" design, it went south big time. I'll stick with Synopsys, thank you.
>
> Bruce
>
Article: 17795
David Kessner wrote:
> > 2) The Free6502 core is an interesting design, as I think it could be
> > made to run at a rather higher clock rate.
>
> You're right, it could be ran faster. And you're also
> right in thinking that the best place to start is by splitting
> up the critical path with a pipeline register at the output
> of the microcode ROM. I considered this at one point,
> but quite frankly I wanted to get the design "out" rather
> than spend a 24+ hours rewriting the microcode.

I was hoping that it would not be necessary to rewrite the microcode. My idea for keeping the existing microcode was to use MCT_DONE to stop the sequence counter and force the other microcode bits to do nothing for one cycle, accounted for by adding a DECODE state.

I reran PAR with the -6 part, and the clock speed was 65 MHz without forcing registers into IOBs and 59 MHz with forcing registers into IOBs.

> I should probably point out that this 6502 is not a
> slow design, since it does run significantly faster than
> currently available 6502 chips. Imagine an Apple II
> at 28 MHz! But I also understand that faster speeds
> are always desirable.

Well then, imagine an Apple II at 65 MHz or more.

Speed is nice: I think that a 6502 could be made to run at 133 MHz using the internal block RAMs for program storage and microcode. However, for some applications keeping the timing of instructions is important. Many Apple II games wouldn't be playable, as they relied on software speed to control the speed of the game. For some control applications, you might even want to run at 2 MHz with exact cycle-by-cycle bus activity so as not to break existing software.

> I think that the best forum to discuss this is in the Free-IP
> mailing list.

Perhaps so, for improvements to the Free6502 core. However, I do still intend to post some results here comparing Synplify and Exemplar LS.

--
Phil Hays
"Irritatingly, science claims to set limits on what we can
do, even in principle." Carl Sagan
Article: 17796
Joseph H Allen wrote:
> If you just want a speed increase there are other issues. For example if
> the average instruction is 3 cycles and the pipelining makes is 4 instead,
> you are not getting much of an improvement: 9.5 MIPS (28MHz/3 cycles)
> compared to 11.25 MIPS (45MHz/4 cycles). Only about 20% faster.

Perhaps I should have waited until I had run PAR with the same part, and even redone the non-pipelined case. The 28.5 MHz number came from a -6 part, and the 45 MHz number came from a slower -4 part. With a -6 part the pipelined design runs at 59 MHz, or twice as fast. It's still not clear to me if the results are comparable: PAR produces better results when you tell it to try harder, which I was doing, and I'm not sure how hard PAR was told to try for the first number.

Also, the pipeline register makes the design easier to floorplan. It is far easier to floorplan registers than logic when using HDL tools, as the registers usually have fixed names, and the logic only does if you take pains to make it so.

Lastly, the Xilinx Virtex (and Altera's latest) have fairly large RAMs that are pipelined, and could be used as microcode ROMs. Once the data path is cleaned up a bit, this would allow for size reduction and speed increase of the microstore.

--
Phil Hays
"Irritatingly, science claims to set limits on what we can
do, even in principle." Carl Sagan
Article: 17797
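The throughput arithmetic in the exchange above can be checked with a short Python sketch. The clock figures (28.5, 45 and 59 MHz) are the ones quoted in the thread; the averages of 3 and 4 cycles per instruction are Joseph Allen's illustrative assumption, not measured values, and the helper name `mips` is made up for this example.

```python
def mips(clock_mhz: float, cycles_per_instruction: float) -> float:
    """Millions of instructions per second for a given clock and average CPI."""
    return clock_mhz / cycles_per_instruction


# Non-pipelined design, -6 part, assumed 3 cycles per instruction.
baseline = mips(28.5, 3)

# Pipelined design, assumed 4 cycles per instruction.
pipelined_slow_part = mips(45.0, 4)   # -4 part
pipelined_fast_part = mips(59.0, 4)   # -6 part

# Speed-up relative to the non-pipelined baseline.
gain_slow_part = pipelined_slow_part / baseline
gain_fast_part = pipelined_fast_part / baseline
```

Under these assumptions the clock roughly doubles on the -6 part, but the instruction rate rises by only about 55%, because the extra pipeline cycle eats part of the gain. If the DECODE-state stall costs fewer than one extra cycle per instruction on average, the real gain would be higher.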
Here's a question that may or may not be an ignorant one (if it is, just point me to a FAQ that explains it). I have a circuit of unknown origin containing an Actel A1010B. This is an antifuse-based part, I believe, set up to do address decoding. I want to make modifications to this circuit, but I don't have the original design documentation. Is it possible to read out the programming of this part, in order to understand exactly what it is doing and what changes are needed? I have not been able to find anything that describes how to do this, or even if it can be done.

TIA

smith
Article: 17798
If the protection fuse is 'unblown' then you won't be able to read it back no matter what. If not, then you could read back the bitstream, but it is not going to mean much to you. The translation from logic to bitstream is not generally published, and it is a monumental effort to reverse engineer. So the short answer is: you're out of luck. X wrote: > Here's a question that may or may not be an ignorant one (if it is, > just point me to a FAQ that explains it). I have a circuit of unknown > orign containing an Actel A1010B. This is an anti-fuse based part, I > believe, set up to do address decoding. I want to make modifications > to this circuit, but I don't have the original design documentation. > Is it possible to read out the programming of this part, in order to > understand exactly what it is doing and what changes are needed? I > have not been able to find anything that describes how to do this, or > even if it can be done. > > TIA > > smith -- -Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email randraka@ids.net http://users.ids.net/~randrakaArticle: 17799
hi, yes, the a1010b is an antifuse part having 295 logic modules. i don't know any way to read out the programming pattern, let alone make it into a schematic. visual examination of the die won't help, even with a sem. cross-sectioning isn't so easy, either, and is not practical for decoding a chip. it takes a bit of patience and some sophisticated gear to do much of anything. if the security fuse is not programmed, you can use their debug feature to see what's going on. you shift in codes and then you can monitor the output of any module you select. this is documented in the actel data book. jones ------------------------------------------------------ X wrote: > Here's a question that may or may not be an ignorant one (if it is, > just point me to a FAQ that explains it). I have a circuit of unknown > orign containing an Actel A1010B. This is an anti-fuse based part, I > believe, set up to do address decoding. I want to make modifications > to this circuit, but I don't have the original design documentation. > Is it possible to read out the programming of this part, in order to > understand exactly what it is doing and what changes are needed? I > have not been able to find anything that describes how to do this, or > even if it can be done. > > TIA > > smith