Anyone know if it is possible that I could get free samples of FPGAs?

Article: 19426
Hi Michael

I work on two machines, a Win95 PII 400 MHz and a DR DOS 486 66 MHz. The target computer is the 486, so I use only the 16-bit version of jam.exe. I now have different versions: Jam 1.2 (JAM102), Jam 2.0 (JAM200) and Jam 2.12 (JAM212).

On Win95 the DOS box closes directly after the program terminates, so I cannot read the error statement. On the DR DOS machine I can see the error, and I pipe it into a text file:

> jam212 -v -dDO_CONFIGURE=1 -p378 hicen1.jam >t.txt
-------------
Jam (Stapl) Player Version 2.12
Copyright (C) 1997-1999 Altera Corporation
CRC matched: CRC value = 5B21
NOTE "CREATOR" = "POF to JAM converter Version 9.3 7/23/1999"
NOTE "DEVICE" = "EPF10K10"
NOTE "FILE" = "hicentotal.sof"
NOTE "USERCODE" = "0FF100DD"
NOTE "JAM_VERSION" = "1.0"
NOTE "ALG_VERSION" = "2.3"
Device #1 IDCODE is 010100DD
configuring FLEX device(s)...
Error on line 411: undefined symbol. <<-----------------
Program terminated.
Elapsed time = 00:00:02
------------

All versions of JAM???.EXE give the same error on line 411. From the Altera distributor in Munich I got a Jam file generated with MAX+plus 9.1; with that one I get "Error on line 405".

I also tested the 32-bit version of jam.exe. There I get a memory protection error (Win95 fault dialog, translated from German):

--------
JAM caused an invalid page fault in module JAM.EXE at 0167:00414dc7.
Registers:
EAX=00000000 CS=0167 EIP=00414dc7 EFLGS=00010246
EBX=00000073 SS=016f ESP=0065fb30 EBP=0065fb34
ECX=ffffffff DS=016f ESI=0041f03e FS=69df
EDX=00000010 ES=016f EDI=00000010 GS=0000
Bytes at CS:EIP:
--------

Now I have no idea what I can do to make it work!

Michael Stanton <mikes@magtech.com.au> wrote in message 385ED38E.24214D76@magtech.com.au...
> Hi Thomas
>
> We have never had to alter any lines inside the Jam source file and have always
> been able to use the .jam file produced by Max+Plus II.
>
> The following is the DOS command line we use to program a FLEX 10K30A as part of
> a three device JTAG chain:
>
> jam -v -dDO_CONFIGURE=1 -p378 cpld_top.jam
>
> We are using Jam.exe ver 1.2 and Max+Plus II 9.3 and have a ByteBlasterMV
> connected to a standard PC printer port (LPT1 at 378h) via a 2m long D25M-D25F
> extension cable.
>
> There are two versions of jam.exe: 16-bit DOS and Win95/WinNT. Have you
> tried each version?
>
> Can't think of anything else to try - hope it works out for you!
>
> Regards, Michael
>
> Thomas Bornhaupt wrote:
>
> > Hi Michael,
> >
> > thank you for your tips. But it does not work.
> >
> > It seems to me that MAX+plus (9.3) generates wrong JAM or JBC files.
> >
> > I tested Jam.EXE 1.2 with -dDO_CONFIGURE, but the chip is not
> > programmed.
> >
> > Inside the JAM file (language 1.1) I found this line
> >
> > BOOLEAN DO_CONFIGURE = 0;
> >
> > So I set it to
> >
> > BOOLEAN DO_CONFIGURE = 1;
> >
> > Starting JAM.EXE I got a syntax error in line 440!
> >
> > I also tested JAM.EXE 2.2. Here you have the option -aCONFIGURE. This is the
> > action out of the JAM file (STAPL format):
> >
> > ACTION CONFIGURE = PR_INIT_CONFIGURE, PR_EXECUTE;
> >
> > And now I got an exception. The DOS box went away directly, and a pure
> > DOS machine hung up with an EMM386 error.
> >
> > regards
> > Thomas Bornhaupt

Article: 19427
elynum@my-deja.com wrote:
> Anyone know if it is possible that I could
> get free samples of fpgas.

i would say that it is possible - of course, it depends on your negotiating skills, how much it looks like you might actually buy something, etc., etc.

one company that i know of does have a policy of giving out free devices, quicklogic:

    With QuickLogic's new WebASIC program, you can receive programmed
    FPGA and ESP devices at no cost within 24-48 hours of sending us
    your design data via the Internet.

while we have been talking about free software for years (sort of here for some) this is the first that i've seen of making it policy for free hardware.

good luck!

----------------------------------------------------------------------
rk                          The world of space holds vast promise
stellar engineering, ltd.   for the service of man, and it is a
stellare@erols.com.NOSPAM   world we have only begun to explore.
Hi-Rel Digital Systems Design        -- James E. Webb, 1968

Article: 19428
Hi Bob

Thanks for this suggestion. I wonder, too, as a further precaution, whether it might be a good idea to release the reset on the opposite edge to the one used by the FSMs - assuming, of course, that they all use the same edge - since otherwise, depending on the consistency of the skew between clk and sync_reset across a device/board (as they are high-fanout signals), this approach might even make things worse?

regds
Mike

In article <385e4a2b.75584735@nntp.best.com>, bob@nospam.thanks (Bob Perlman) wrote:
> My policy is to give every FSM an asynchronous reset and a synchronous
> reset. The asynchronous reset puts the FSM in the right state even in
> the absence of a clock, which is important if the FSM is controlling,
> say, internal or external TriStates that might otherwise contend. The
> synchronous reset works around the problem you mentioned (by the way,
> 'slim and none' is just another phrase for 'sooner or later, for
> sure'). I do one-hot FSMs exclusively, and I apply the synchronous
> reset only to the initial state FF of the FSM; I use it to (a) hold
> that FF set and (b) gate off that FF's output to any other state FF.
>
> I create the synchronous reset with a pipeline of 3 or 4 FFs, all of
> which get a global reset. A HIGH is fed to the D of the first FF, and
> gets propagated to the end of the chain after reset is released. The
> output of the last FF is inverted to produce the active HIGH
> synchronous reset. For devices that support global sets, you can just
> set all the FFs, feed a LOW into the first FF, and dispense with the
> inverter at the end. It's important to clock this FF chain with the
> same clock used for the FSM, of course.
>
> There are other ways to work around this problem, such as adding extra
> do-nothing states after the initial states in a one-hot, or making
> sure that the FSM won't transition out of the initial state until a
> few cycles after the asynch reset has been released. These work, too.
> The method I've described is easy to do in either schematics or HDL
> and, if desired, allows you to easily synchronize the startup of
> multiple FSMs.
>
> Take care,
> Bob Perlman

Article: 19429
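A minimal VHDL sketch of the synchronous-reset pipeline Bob describes, assuming an active-high global asynchronous reset and with all names invented here for illustration:

-- Synchronous reset generator: 4-FF pipeline, all FFs cleared by the
-- global asynchronous reset. A HIGH propagates down the chain after
-- the async reset is released, and the last FF is inverted to give
-- the active-HIGH synchronous reset.
library ieee;
use ieee.std_logic_1164.all;

entity sync_reset_gen is
  port ( clk  : in  std_logic;
         arst : in  std_logic;      -- global asynchronous reset, active high (assumed)
         srst : out std_logic );    -- synchronous reset, active high
end sync_reset_gen;

architecture rtl of sync_reset_gen is
  signal pipe : std_logic_vector(3 downto 0);
begin
  process (arst, clk)
  begin
    if arst = '1' then
      pipe <= (others => '0');
    elsif clk'event and clk = '1' then
      pipe <= pipe(2 downto 0) & '1';   -- shift a HIGH in at the bottom
    end if;
  end process;
  srst <= not pipe(3);   -- deasserts 4 clocks after the async reset goes away
end rtl;

As Bob notes, this chain must be clocked by the same clock as the FSMs it resets.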
And once you get it, I guess you'll need free PWB layout, fab and assembly? Most of the modern packaging is not well suited for hobbyist work - fine pitch quad flat packs and ball grid arrays take special techniques to mount on the board. rk wrote: > elynum@my-deja.com wrote: > > > Anyone know if it is possible that I could > > get free samples of fpgas. > > i would say that it is possible - of course, it depends on your > negotiating skills, how much it looks like you might actually buy > something, etc., etc. > > one company that i know of does have a policy of giving out free > devices, quicklogic: > > With QuickLogic’s new > WebASIC program, you can > receive programmed FPGA and > ESP devices at no cost within > 24-48 hours of sending us your > design data via the Internet. > > while we have been talking about free software for years (sort of here > for some) this is the first that i've seen of making it policy for free > hardware. > > good luck! > > ---------------------------------------------------------------------- > rk The world of space holds vast promise > stellar engineering, ltd. for the service of man, and it is a > stellare@erols.com.NOSPAM world we have only begun to explore. > Hi-Rel Digital Systems Design -- James E. Webb, 1968 -- -Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email randraka@ids.net http://users.ids.net/~randrakaArticle: 19430
In <385E7128.4335A25A@ids.net> Ray Andraka <randraka@ids.net> writes:

Thanks Ray,

if I start it in a simple way, say the following way: instead of writing all the difficult search stuff in the FPGA, I only write the evaluation in the FPGA. This can be done in parallel, of course, to a big extent, so it's smart to do it in the FPGA.

If this can be done in under 1 usec, then that would be great. Even 2 usec is acceptable. If it needs more than 10 usec then that would suck bigtime. Basically it must be able to evaluate 200,000 times a second.

I bet that this is technically a lot simpler than writing the whole search, with its big amount of memory, in the FPGA. My evaluation is gigantic though. First I generate a data structure with a lot of information, which is later used everywhere in the evaluation, so that's at least some clocks extra.

If I'm not mistaken, an eval in the FPGA at 50 MHz may need 100 clocks in total and still deliver 500k evaluations a second, right?

What kind of components would I need for this? How about a PCI card - are those already available?

>
>Vincent Diepeveen wrote:
>
>> On Sat, 18 Dec 1999 12:50:33 -0500, Ray Andraka <randraka@ids.net>
>> wrote:
>>
>> >Dann Corbit wrote:
>> >
>> >> "Ray Andraka" <randraka@ids.net> wrote in message
>> >> news:385B1DEE.7517AAC7@ids.net...
>> >> > The chess processor as you describe would be sensible in an FPGA. Current
>> >> > offerings have extraordinary logic densities, and some of the newer FPGAs have
>> >> > over 500K of on-chip RAM which can be arranged as a very wide memory. Some of
>> >> > the newest parts have several million 'marketing' gates available too. FPGAs
>> >> > have long been used as prototyping platforms for custom silicon.
>> >>
>> >> I am curious about the memory. Chess programs need to access at least tens
>> >> of megabytes of memory. This is used for the hash tables, since the same
>> >> areas are repeatedly searched. Without a hash table, the calculations must
>> >> be performed over and over. Some programs can even access gigabytes of ram
>> >> when implemented on a mainframe architecture. Is very fast external ram
>> >> access possible from FPGA's?
>> >
>> >This is conventional CPU thinking. With the high degree of parallelism in the
>>
>> No this is algorithmic speedup design.
>
>What I meant by this is that just using the FPGA to accelerate the CPU algorithm
>isn't necessarily going to give you all the FPGA is capable of doing. You need to
>rethink some of the algorithm to optimize it to the resources you have available in
>the FPGA. The algorithm as it stands now is at least somewhat tailored to a cpu
>implementation. It appears your thinking is just using the FPGA to speed up the
>inner loop, where what I am proposing is to rearrange the algorithm so that the FPGA
>might for example look at the whole board state on the current then next move. In a
>CPU based algorithm, the storage is cheap and the computation is expensive. In an
>FPGA, you have an opportunity for very wide parallel processes (you can even send a
>lock signal laterally across process threads). Here the processing is generally
>cheaper than the storage of intermediate results. The limiting factor is often the
>I/O bandwidth, so you want to rearrange your algorithm to tailor it to the quite
>different limitations of the FPGA.
>
>> Branching factor (time multiplier to see another move ahead)
>> gets better with it by a large margin.
>> >> So BF in the next formula gets better >> >> # operations in FGPA = C * (BF^n) >> where n is a positive integer. >> >> >FPGA and the large amount of resources in some of the more recent devices, it >> >may very well be that it is more advantageous to recompute the values rather >> >than fetching them. There may even be a better approach to the algorithm that >> >just isn't practical on a conventional CPU. Early computer chess did not use >> >the huge memories. I suspect the large memory is more used to speed up the >> >processing rather than a necessity to solving the problem. >> >> Though #operations used by deep blue was incredible compared to >> any program of today at world championship 1999 many programs searched >> positionally deeper (deep blue 5 to 6 moves ahead some programs >> looking there 6-7 moves ahead). >> >> This all because of these algorithmic improvements. >> >> It's like comparing bubblesort against merge sort. >> You need more memory for merge sort as this is not in situ but >> it's O (n log n). Take into account that in computergames the >> option to use an in situ algorithm is not available. >> >> >> > If I were doing such I design in an FPGA however, I would look deeper to >> >> see >> >> > what algorithmic changes could be done to take advantage of the >> >> parallelism >> >> > offered by the FPGA architecture. Usually that means moving away from a >> >> > traditional GP CPU architecture which is limited by the inherently serial >> >> > instruction stream. If you are trying to mimic the behavior of a CPU, you >> >> would >> >> > possibly do better with a fast CPU, as you will get be able to run those >> >> at a >> >> > higher clock rate. The FPGA gains an advantage over CPUs when you can >> >> take >> >> > advantage of parallelism to get much more done in a clock cycle than you >> >> can >> >> > with a CPU. >> >> >> >> The ability to do many things at once may be a huge advantage. I don't >> >> really know anything about FPGA's, but I do know that in chess, there are a >> >> large number of similar calcutions that take place at the same time. The >> >> more things that can be done in parallel, the better. >> > >> >Think of it as a medium for creating a custom logic circuit. A conventional CPU >> >is specific hardware optimized to perform a wide variety of tasks, none >> >especially well. Instead we can build a circuit the specifically addresses the >> >chess algorithms at hand. Now, I don't really know much about the algorithms >> >used for chess. I suspect one would look ahead at all the possibilities for at >> >least a few moves ahead and assign some metric to each to determine the one with >> >the best likely cost/benefit ratio. The FPGA might be used to search all the >> >possible paths in parallel. >> >> My program allows parallellism. i need bigtime locking for this, in >> order to balance the parallel paths. >> >> How are the possibilities in FPGA to press several of the same program >> at one cpu, so that inside the FPGA there is a sense of parallellism? >> >> How about making something that enables to lock within the FPGA? >> >> It's not possible my parallellism without locking, as that's the same >> bubblesort versus merge sort story, as 4 processors my program gets >> 4.0 speedup, but without the locking 4 processors would be a >> lot slower than a single sequential processor. >> >> >> > That said, I wouldn't recommend that someone without a sound footing in >> >> > synchronous digital logic design take on such a project. 
Ideally the >> >> designer >> >> > for something like this is very familiar with the FPGA architecture and >> >> tools >> >> > (knows what does and doesn't map efficiently in the FPGA architecture), >> >> and is >> >> > conversant in computer architecture and design and possibly has some >> >> pipelined >> >> > signal processing background (for exposure to hardware efficient >> >> algorithms, >> >> > which are usually different than ones optimized for software). >> >> I am just curious about feasibility, since someone raised the question. I >> >> would not try such a thing by myself. >> >> >> >> Supposing that someone decided to do the project (however) what would a >> >> rough ball-park guestimate be for design costs, the costs of creating the >> >> actual masks, and production be for a part like that? >> > >> >The nice thing about FPGAs is that there is essentially no NRE or fabrication >> >costs. The parts are pretty much commodity items, purchased as generic >> >components. The user develops a program consisting of a compiled digital logic >> >design, which is then used to field customize the part. Some FPGAs are >> >programmed once during the product manufacturer (one time programmables include >> >Actel and Quicklogic). Others, including the Xilinx line, have thousands of >> >registers that are loaded up by a bitstream each time the device is powered up. >> >The bitstream is typically stored in an external EPROM memory, or in some cases >> >supplied by an attached CPU. Part costs range from under $5 for small arrays to >> >well over $1000 for the newest largest fastest parts. >> >> How about a program that's having thousands of chessrules and >> incredible amount of loops within them and a huge search, >> >> So the engine & eval only equalling 1.5mb of C source code. >> >> How expensive would that be, am i understaning here that >> i need for every few rules to spent another $1000 ? > >It really depends on the implementation. The first step in finding a good FPGA >implementation is repartitioning the algorithm. This ground work is often the >longest part of the FPGA design cycle, and it is a part that is not even really >acknowledged in the literature or by the part vendors. Do the system work up front >to optimize the architecture for the resoucrces you have available, and in the end >you will wind up with something much better, faster, and smaller than anything >arrived at by simple translation. > >At one extreme, one could just us the FPGA to instantiate custom CPUs with a >specialized instruction set for the chess program. That approach would likely net >you less performance than an emulator for the custom CPU running on a modern >machine. The reason for that is the modern CPUs are clocked at considerably higher >clock rates than a typical FPGA design is capable of, so even if the emulation takes >an average of 4 or 5 cycles for each custom instruction, it will still keep up with >or outperform the FPGA. Where the FPGA gets its power is the ability to do lots of >stuff at the same time. To take advantage of that, you usually need to get away >from an instruction based processor. > > > >> >> >> >The design effort for the logic circuit you are looking at is not trivial. For >> >the project you describe, the bottom end would probably be anywhere from 12 >> >weeks to well over a year of effort depending on the actual complexity of the >> >design, the experience of the designer with the algorithms, FPGA devices and >> >tools. >> >> I needed years to write it in C already... 
>> >> Vincent Diepeveen >> diep@xs4all.nl >> >> >> -- >> >> C-FAQ: http://www.eskimo.com/~scs/C-faq/top.html >> >> "The C-FAQ Book" ISBN 0-201-84519-9 >> >> C.A.P. Newsgroup http://www.dejanews.com/~c_a_p >> >> C.A.P. FAQ: ftp://38.168.214.175/pub/Chess%20Analysis%20Project%20FAQ.htm >> >> >-- >> >-Ray Andraka, P.E. >> >President, the Andraka Consulting Group, Inc. >> >401/884-7930 Fax 401/884-7950 >> >email randraka@ids.net >> >http://users.ids.net/~randraka > > > >-- >-Ray Andraka, P.E. >President, the Andraka Consulting Group, Inc. >401/884-7930 Fax 401/884-7950 >email randraka@ids.net >http://users.ids.net/~randraka > > -- +----------------------------------------------------+ | Vincent Diepeveen email: vdiepeve@cs.ruu.nl | | http://www.students.cs.ruu.nl/~vdiepeve/ | +----------------------------------------------------+Article: 19431
Hi friends,

I am trying to use an Am29LV800B, but the READY/BUSY pin is always 0 (even after reset). Does anybody have ideas what could be the cause (or the name of an appropriate newsgroup)? I would also like to know whether anybody has used this part in their designs.

Article: 19432
Hi Bonio,

I use the 29F800B chip. The Ready/Busy pin is open-collector (as I remember). Did you connect a pull-up?

Regards,
Lutz

Bonio Lopez wrote in message <3fc5848e.be18dcdb@usw-ex0102-009.remarq.com>...
>Hi friends,
>I am trying to use an Am29LV800B, but the READY/BUSY pin is always 0
>(even after reset). Does anybody have ideas what could be the cause
>(or the name of an appropriate newsgroup)?
>I would also like to know whether anybody has used this part in their designs.

Article: 19433
That would be one possible partition. You can probably pipeline the evaluation so that you might have several evaluations in progress at once. That way you can get more than the 100 clocks per evaluation in the case you mention below. Again, I'd have to sit down and noodle over the algorithms to get a really good partition and implementation. For hardware, your best bet would probably be to buy one of the commercially available boards out there. Many have PCI interfaces, some of which use the FPGA for the PCI. Check out www.optimagic.com for a pretty comprehensive listing of available boards. You'll want to partition the algorithm between the processor and the FPGA before you make a final selection of the board so that you make sure you have the right amount of and connections to external memory for the application. There are nearly as many board architectures as there are boards. Vincent Diepeveen wrote: > In <385E7128.4335A25A@ids.net> Ray Andraka <randraka@ids.net> writes: > > Thanks Ray, > > if i start it in a simple way, say in next way: > > instead of writing all the difficult search stuff in FPGA, > i only write the evaluation in FPGA. This can be done in parallel > of course to a big extend. > So it's smart to do it in fpga. > > If this can be done under 1 usec, then that would be great. > Even 2 usec is acceptible. > If it's needing more than 10 usec then that would suck bigtime. > > Basically it must be able to evaluate 200,000 times a second. > > I bet that this is technical a lot simpler than writing the whole search with > big amountof memory in fpga. > > My evaluation is gigantic though. First i generate a datastructure > with a lot of information, which is later used everywhere in the evaluation, > so that's at least some clocks extra. > > If i'm not mistaken in FPGA the eval at 50Mhz may need in totally 100 clocks > to still deliver 500k evaluations a second, right? > > What kind of components would i need for this? > > How about a PCI card, are those already available? > > > > > > >Vincent Diepeveen wrote: > > > >> On Sat, 18 Dec 1999 12:50:33 -0500, Ray Andraka <randraka@ids.net> > >> wrote: > >> > >> > > >> > > >> >Dann Corbit wrote: > >> > > >> >> "Ray Andraka" <randraka@ids.net> wrote in message > >> >> news:385B1DEE.7517AAC7@ids.net... > >> >> > The chess processor as you describe would be sensible in an FPGA. Current > >> >> > offerings have extraordinary logic densities, and some of the newer FPGAs > >> >> have > >> >> > over 500K of on-chip RAM which can be arranged as a very wide memory. > >> >> Some of > >> >> > the newest parts have several million 'marketing' gates available too. > >> >> FPGAs > >> >> > have long been used as prototyping platforms for custom silicon. > >> >> > >> >> I am curious about the memory. Chess programs need to access at least tens > >> >> of megabytes of memory. This is used for the hash tables, since the same > >> >> areas are repeatedly searched. Without a hash table, the calculations must > >> >> be performed over and over. Some programs can even access gigabytes of ram > >> >> when implemented on a mainframe architecture. Is very fast external ram > >> >> access possible from FPGA's? > >> > > >> >This is conventional CPU thinking. With the high degree of parallelism in the > >> > >> No this is algorithmic speedup design. > >> > > > >What I meant by this is that just using the FPGA to accelerate the CPU algorithm > >isn't necessarily going to give you all the FPGA is capable of doing. 
You need to > >rethink some of the algorithm to optimize it to the resources you have available in > >the FPGA. The algorithm as it stands now is at least somewhat tailored to a cpu > >implementation. It appears your thinking is jsut using the FPGA to speed up the > >inner loop, where what I am proposing is to rearrange the algorithm so that the FPGA > >might for example look at the whole board state on the current then next move. In a > >CPU based algorithm, the storage is cheap and the computation is expensive. In an > >FPGA, you have an opportunity for very wide parallel processes (you can even send a > >lock signal laterally across process threads). Here the processing is generally > >cheaper than the storage of intermediate results. The limiting factor is often the > >I/O bandwidth, so you want to rearrange your algorithm to tailor it to the quite > >different limitations of the FPGA. > > > >> Branching factor (time multiplyer to see another move ahead) > >> gets better with it by a large margin. > >> > >> So BF in the next formula gets better > >> > >> # operations in FGPA = C * (BF^n) > >> where n is a positive integer. > >> > >> >FPGA and the large amount of resources in some of the more recent devices, it > >> >may very well be that it is more advantageous to recompute the values rather > >> >than fetching them. There may even be a better approach to the algorithm that > >> >just isn't practical on a conventional CPU. Early computer chess did not use > >> >the huge memories. I suspect the large memory is more used to speed up the > >> >processing rather than a necessity to solving the problem. > >> > >> Though #operations used by deep blue was incredible compared to > >> any program of today at world championship 1999 many programs searched > >> positionally deeper (deep blue 5 to 6 moves ahead some programs > >> looking there 6-7 moves ahead). > >> > >> This all because of these algorithmic improvements. > >> > >> It's like comparing bubblesort against merge sort. > >> You need more memory for merge sort as this is not in situ but > >> it's O (n log n). Take into account that in computergames the > >> option to use an in situ algorithm is not available. > >> > >> >> > If I were doing such I design in an FPGA however, I would look deeper to > >> >> see > >> >> > what algorithmic changes could be done to take advantage of the > >> >> parallelism > >> >> > offered by the FPGA architecture. Usually that means moving away from a > >> >> > traditional GP CPU architecture which is limited by the inherently serial > >> >> > instruction stream. If you are trying to mimic the behavior of a CPU, you > >> >> would > >> >> > possibly do better with a fast CPU, as you will get be able to run those > >> >> at a > >> >> > higher clock rate. The FPGA gains an advantage over CPUs when you can > >> >> take > >> >> > advantage of parallelism to get much more done in a clock cycle than you > >> >> can > >> >> > with a CPU. > >> >> > >> >> The ability to do many things at once may be a huge advantage. I don't > >> >> really know anything about FPGA's, but I do know that in chess, there are a > >> >> large number of similar calcutions that take place at the same time. The > >> >> more things that can be done in parallel, the better. > >> > > >> >Think of it as a medium for creating a custom logic circuit. A conventional CPU > >> >is specific hardware optimized to perform a wide variety of tasks, none > >> >especially well. 
Instead we can build a circuit the specifically addresses the > >> >chess algorithms at hand. Now, I don't really know much about the algorithms > >> >used for chess. I suspect one would look ahead at all the possibilities for at > >> >least a few moves ahead and assign some metric to each to determine the one with > >> >the best likely cost/benefit ratio. The FPGA might be used to search all the > >> >possible paths in parallel. > >> > >> My program allows parallellism. i need bigtime locking for this, in > >> order to balance the parallel paths. > >> > >> How are the possibilities in FPGA to press several of the same program > >> at one cpu, so that inside the FPGA there is a sense of parallellism? > >> > >> How about making something that enables to lock within the FPGA? > >> > >> It's not possible my parallellism without locking, as that's the same > >> bubblesort versus merge sort story, as 4 processors my program gets > >> 4.0 speedup, but without the locking 4 processors would be a > >> lot slower than a single sequential processor. > >> > >> >> > That said, I wouldn't recommend that someone without a sound footing in > >> >> > synchronous digital logic design take on such a project. Ideally the > >> >> designer > >> >> > for something like this is very familiar with the FPGA architecture and > >> >> tools > >> >> > (knows what does and doesn't map efficiently in the FPGA architecture), > >> >> and is > >> >> > conversant in computer architecture and design and possibly has some > >> >> pipelined > >> >> > signal processing background (for exposure to hardware efficient > >> >> algorithms, > >> >> > which are usually different than ones optimized for software). > >> >> I am just curious about feasibility, since someone raised the question. I > >> >> would not try such a thing by myself. > >> >> > >> >> Supposing that someone decided to do the project (however) what would a > >> >> rough ball-park guestimate be for design costs, the costs of creating the > >> >> actual masks, and production be for a part like that? > >> > > >> >The nice thing about FPGAs is that there is essentially no NRE or fabrication > >> >costs. The parts are pretty much commodity items, purchased as generic > >> >components. The user develops a program consisting of a compiled digital logic > >> >design, which is then used to field customize the part. Some FPGAs are > >> >programmed once during the product manufacturer (one time programmables include > >> >Actel and Quicklogic). Others, including the Xilinx line, have thousands of > >> >registers that are loaded up by a bitstream each time the device is powered up. > >> >The bitstream is typically stored in an external EPROM memory, or in some cases > >> >supplied by an attached CPU. Part costs range from under $5 for small arrays to > >> >well over $1000 for the newest largest fastest parts. > >> > >> How about a program that's having thousands of chessrules and > >> incredible amount of loops within them and a huge search, > >> > >> So the engine & eval only equalling 1.5mb of C source code. > >> > >> How expensive would that be, am i understaning here that > >> i need for every few rules to spent another $1000 ? > > > >It really depends on the implementation. The first step in finding a good FPGA > >implementation is repartitioning the algorithm. This ground work is often the > >longest part of the FPGA design cycle, and it is a part that is not even really > >acknowledged in the literature or by the part vendors. 
Do the system work up front > >to optimize the architecture for the resoucrces you have available, and in the end > >you will wind up with something much better, faster, and smaller than anything > >arrived at by simple translation. > > > >At one extreme, one could just us the FPGA to instantiate custom CPUs with a > >specialized instruction set for the chess program. That approach would likely net > >you less performance than an emulator for the custom CPU running on a modern > >machine. The reason for that is the modern CPUs are clocked at considerably higher > >clock rates than a typical FPGA design is capable of, so even if the emulation takes > >an average of 4 or 5 cycles for each custom instruction, it will still keep up with > >or outperform the FPGA. Where the FPGA gets its power is the ability to do lots of > >stuff at the same time. To take advantage of that, you usually need to get away > >from an instruction based processor. > > > > > > > >> > >> > >> >The design effort for the logic circuit you are looking at is not trivial. For > >> >the project you describe, the bottom end would probably be anywhere from 12 > >> >weeks to well over a year of effort depending on the actual complexity of the > >> >design, the experience of the designer with the algorithms, FPGA devices and > >> >tools. > >> > >> I needed years to write it in C already... > >> > >> Vincent Diepeveen > >> diep@xs4all.nl > >> > >> >> -- > >> >> C-FAQ: http://www.eskimo.com/~scs/C-faq/top.html > >> >> "The C-FAQ Book" ISBN 0-201-84519-9 > >> >> C.A.P. Newsgroup http://www.dejanews.com/~c_a_p > >> >> C.A.P. FAQ: ftp://38.168.214.175/pub/Chess%20Analysis%20Project%20FAQ.htm > >> > >> >-- > >> >-Ray Andraka, P.E. > >> >President, the Andraka Consulting Group, Inc. > >> >401/884-7930 Fax 401/884-7950 > >> >email randraka@ids.net > >> >http://users.ids.net/~randraka > > > > > > > >-- > >-Ray Andraka, P.E. > >President, the Andraka Consulting Group, Inc. > >401/884-7930 Fax 401/884-7950 > >email randraka@ids.net > >http://users.ids.net/~randraka > > > > > -- > +----------------------------------------------------+ > | Vincent Diepeveen email: vdiepeve@cs.ruu.nl | > | http://www.students.cs.ruu.nl/~vdiepeve/ | > +----------------------------------------------------+ -- -Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email randraka@ids.net http://users.ids.net/~randrakaArticle: 19434
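A quick worked example of the pipelining point above (illustrative arithmetic only, using Vincent's 50 MHz / 100-clock figures): in a pipelined evaluator, latency and throughput are separate numbers.

  Unpipelined:      50 MHz / 100 clocks per evaluation = 500,000 evaluations/s
  Fully pipelined:  one new board accepted per clock   = 50,000,000 evaluations/s,
                    each result still appearing 100 clocks (2 us) after its board goes in

So the 100-clock evaluation depth costs latency, not throughput, once the pipeline stages are kept busy.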
I will see - I think that could be the cause of the RY/nBY behavior. (Thanks for the hint.)

And now the second problem, which I can't understand: with reset='1' (working mode), the data bus shows the same values as the address bus during a read operation (nCE='0', nOE='0', nWE='1', reset='1'). I have absolutely no idea what it could be. Maybe you can point out my mistake?

Article: 19435
rk <stellare@nospam.erols.com> wrote in message news:385EE12E.F6315E86@nospam.erols.com...
> perhaps another one ...
>
> no self-clearing structures (f-f output to it's clear). find another way to make
> that pulse! i've seen this too many times!

I've used a form of this when interfacing an FPGA to a DSP's bus (both run off different clocks). The DSP's clock registers a write strobe in flip flop #1 (while the data bus contents are registered in a bunch of other registers). This is followed by two flip flops, #2 and #3 (#2 acting as a synchronizer, more or less) clocked by the FPGA's clock. The output of #3 goes to the FPGA's state machines, etc., and also goes back to the asynchronous reset input on 'flop #1. (Note that the DSP is slow compared to the FPGA; the output of 'flop #3 fires long before the DSP gets around to sending another write strobe.)

Is this dangerous (it doesn't seem to me to be...)? ...and what's a better alternative?

Thanks...

---Joel Kolstad

Article: 19436
Hi -

On Fri, 17 Dec 1999 09:41:50 -0800, Peter Alfke <peter@xilinx.com> wrote:

<stuff snipped>

>To the buyer, this may look like a bargain, getting a Porsche for the
>price of a VW. But it can get the design in trouble if "dirty asynchronous
>tricks" were used.
>Always assume that the part you buy tomorrow may be faster than the one
>you bought yesterday. That's why we are advocating synchronous design
>methods...

And it's a good thing to advocate. Many problems in FPGAs can be traced to undisciplined asynchronous design techniques whose principal advantage is that they make for good anecdotes.

I thought I'd mention one place where speed increases cause problems in synchronous systems: hold times. Most FPGA vendors guarantee internal hold margins by design. If you're sending a signal from one flip-flop to another inside an FPGA and both flops are driven by the same (global) clock, reliable data transfer is guaranteed, even if the part gets faster.

What's *not* guaranteed is reliable clocking from chip to chip (FPGA to FPGA, FPGA to something else, or something else to FPGA). What happens if, say, you've got an FPGA from a -X speed bin driving another FPGA from a -X speed bin, and the first FPGA is actually a much faster part that's been restamped -X because the vendor had an over-abundance of fast parts? Maybe you have a hold time problem, and maybe you don't.

How do you get around this? Well, here are a few ideas:

a) First and foremost, calculate hold time margins for all board-level interfaces. Many designers calculate setup margins, but overlook hold margins. I don't know why this is. Maybe they tried calculating them once and got results that were too depressing.

b) Get serious about reducing board-level clock skew. This includes using low-skew clock buffers and matching clock trace lengths.

c) Use receiving devices whose input hold time is 0. This doesn't eliminate hold problems (clock skew can still get you), but it sure helps. (Aside: some bus interface parts have *terrible* hold times. My theory is that every few years, the companies that make such parts round up all the people who know anything about setup/hold time issues and fire them.)

d) When you calculate hold time margins, you'll need to estimate the minimum delay for the clock-to-data-output path in the driving device. When making this estimate, make sure you assume that the driving device is from the fastest available speed grade. If, for example, you use a GateMeister -3 FPGA to drive a signal, and the fastest available speed grade for that part is a -7, calculate a minimum delay based on the -7 speed grade. (I use 20 or 25% of max, depending on how conservative I'm feeling. Of course, if the vendor gives me a guaranteed minimum for the -7, I use that.)

The problem I've described above has been around for ages, and not just for FPGAs; it's everywhere. Nor do I think it's unreasonable for vendors to put faster parts in slower speed grades. If customers want to make a speed/cost tradeoff, vendors must be given a way of reliably producing parts that they can sell in each grade, and re-marking parts becomes a necessity.

Bob Perlman

-----------------------------------------------------
Bob Perlman
Cambrian Design Works
Digital Design, Signal Integrity
http://www.best.com/~bobperl/cdw.htm
Send e-mail replies to best<dot>com, username bobperl
-----------------------------------------------------

Article: 19437
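To make (a) and (d) concrete, here is a worked hold-margin check. The 2 ns minimum clock-to-output and 0 ns hold figures happen to match the Flex10KE numbers Jamie quotes further down; the trace delay and clock skew are invented for illustration:

  hold margin = Tco(min, fastest grade) + Ttrace(min) - Tskew(max) - Th(max)
              = 2.0 ns + 0.3 ns - 1.0 ns - 0.0 ns
              = +1.3 ns

A positive margin means the interface still works even if the driver is secretly from the fastest bin; a negative margin means a re-marked fast part can violate the receiver's hold time.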
rk <stellare@nospam.erols.com> wrote in message news:385F5A54.FB6044B1@nospam.erols.com...
> one company that i know of does have a policy of giving out free
> devices, quicklogic:

...because they're fuse-based devices! This is one of the ways they compete with the Xilinx and Alteras of the world, where even "I don't need no stinkin' simulation" designers can eventually get a design up and running "crash and burn" style, since they're just reprogramming SRAM all day. The other way they compete is on speed, of course. (I'm a little surprised that the Xilinx HardWire devices aren't a lot faster than the fastest FPGA speed grade available.)

---Joel Kolstad

Article: 19438
Mike - On Tue, 21 Dec 1999 14:25:55 GMT, micheal_thompson@my-deja.com wrote: >Hi Bob > >Thanks for this suggestion. >I wonder too as a further precaution might it be a good idea to release >the reset on the opposite edge to that used for the FSM's - assuming of >course that they all use the same edge - as otherwise, depending on the >consistency of the skew between clk and sync_reset across a device/ >board (as they are high fanout signals) this approach might even make >things worse? > >regds >Mike Using opposite-edge clocking would increase hold time margin, but at the expense of setup margin. If both the synchronous reset generator circuit and the FSM are inside the same FPGA, this isn't a good tradeoff. As a rule, FPGA vendors guarantee that a signal clocked from one FF to another FF inside an FPGA will not have a hold problem, provided that the two FFs are clocked by the same global clock. In other words, increasing hold margin in such a situation doesn't buy us anything. However, if the sync reset signal goes to a lot of places, there's always the potential for a setup margin problem. I have two ways of getting around such problems: - I feed the sync reset to as few loads as possible (that's why I feed sync reset only to the initial state FF and its state transition logic). - The last FF in the sync reset generator can be duplicated. Instead of a single FF that feeds all sync reset destinations, you have N FFs whose D inputs all get the same signal from the previous FF. Each of these FFs generates the same sync reset signal. The FPGA's place/route tool has the freedom to place these FFs close to their destinations. If you're distributing synchronous reset at a board level, then there may be enough clock skew to justify the opposite-edge approach; calculating the hold margin would tell you. I've never tried distributing such a reset among multiple chips; I distribute the async reset, then generate a synchronous reset on each FPGA. Take care, Bob Perlman > >In article <385e4a2b.75584735@nntp.best.com>, > bob@nospam.thanks (Bob Perlman) wrote: >> My policy is to give every FSM an asynchronous reset and a synchronous >> reset. The asynchronous reset puts the FSM in the right state even in >> the absence of a clock, which is important if the FSM is controlling, >> say, internal or external TriStates that might otherwise contend. The >> synchronous reset works around the problem you mentioned (by the way, >> 'slim and none' is just another phrase for, 'sooner or later, for >> sure'). I do one-hot FSMs exclusively, and I apply the synchronous >> reset only to the initial state FF of the FSM; I use it to (a) hold >> that FF set and (b) gate off that FF's output to any other state FF. >> >> I create the synchronous reset with a pipeline of 3 or 4 FFs, all of >> which get a global reset. A HIGH is fed to the D of the first FF, and >> gets propagated to the end of the chain after reset is released. The >> output of the last FF is inverted to produce the active HIGH >> synchronous reset. For devices that support global sets, you can just >> set all the FFs, feed a LOW into the first FF, and dispense with the >> inverter at the end. It's important to clock this FF chain with the >> same clock used for the FSM, of course. >> >> There are other ways to work around this problem, such as adding extra >> do-nothing states after the initial states in a one-hot, or making >> sure that the FSM won't transition out of the initial state until a >> few cycles after the asynch reset has been released. 
These work, too. >> The method I've described is easy to do in either schematics or HDL >> and, if desired, allows you to easily synchronize the startup of >> multiple FSMs. >> >> Take care, >> Bob Perlman >> > > >Sent via Deja.com http://www.deja.com/ >Before you buy. ----------------------------------------------------- Bob Perlman Cambrian Design Works Digital Design, Signal Integrity http://www.best.com/~bobperl/cdw.htm Send e-mail replies to best<dot>com, username bobperl -----------------------------------------------------Article: 19439
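A VHDL sketch of the last-FF duplication Bob describes above, as a fragment with all names invented here; note that many synthesizers will merge the copies back into one register unless told to keep them, and the attribute that does this is tool-specific (the one below is only a placeholder):

-- Four copies of the final sync-reset flip-flop, all fed from the same
-- next-to-last stage; place/route can then put each copy near the loads
-- it serves.
signal pre_rst : std_logic;                      -- next-to-last FF in the chain
signal rst_ff  : std_logic_vector(3 downto 0);   -- duplicated last FFs
attribute keep : boolean;                        -- hypothetical "don't merge" attribute
attribute keep of rst_ff : signal is true;
...
process (arst, clk)
begin
  if arst = '1' then
    rst_ff <= (others => '0');
  elsif clk'event and clk = '1' then
    rst_ff <= (others => pre_rst);               -- identical D on every copy
  end if;
end process;
-- each destination uses its own copy:
-- srst_a <= not rst_ff(0);  srst_b <= not rst_ff(1);  ...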
Just a brief question: how reliable are the timing results which the M1 P&R tool (on Unix) provides for XC4000 family designs? In particular, how likely is it that the maximum critical path delay can be met in an actual design?

Thanks,

Christof

! WORKSHOP ON CRYPTOGRAPHIC HARDWARE AND EMBEDDED SYSTEMS (CHES 2000) !
!                  WPI, August 17 & 18, 2000                          !
!          http://www.ece.wpi.edu/Research/crypt/ches                 !

***********************************************************************
Christof Paar, Assistant Professor
Cryptography and Information Security (CRIS) Group
ECE Dept., WPI, 100 Institute Rd., Worcester, MA 01609, USA
fon: (508) 831 5061   email: christof@ece.wpi.edu
fax: (508) 831 5491   www: http://ee.wpi.edu/People/faculty/cxp.html
***********************************************************************

Article: 19440
vdiepeve@cs.uu.nl (Vincent Diepeveen) wrote:
> if I start it in a simple way, say the following way:
> instead of writing all the difficult search stuff in the FPGA,
> I only write the evaluation in the FPGA. This can be done in parallel,
> of course, to a big extent, so it's smart to do it in the FPGA.

I think the search stuff is the easier part. It's basically recursive and decomposable. Assume you represent a board state as a large vector; for example, 4 bits can represent the 12 unique pieces, so a 64x4 vector can represent the entire board state (probably less).

If you create a search module that operates on a single square (probably a lookup table or Altera CAM), then it can work serially on a square at a time. This can probably be implemented as a fairly small state machine. It would be possible to fit many of these on a single chip and have them run in parallel. The recursive structure can be implemented by feeding the results to FIFOs which feed back to the input. Something like this:

[ASCII diagram, garbled in the archive: Start feeds a bank of parallel search modules; each module drives an Evaluator into a FIFO; the FIFO outputs merge, pass a cached-lookup path, and feed back to Start.]

Or you can arrange the data flow in a tree structure that mimics the search tree.

The processing rate is likely to be limited by the data path, but a rate of 12.5 MHz per output tree branch seems achievable (a 64-bit wide bus at 50 MHz). If the evaluator is the bottleneck, and we assume an evaluator can be pipelined to process a board state in an average 500 ns, then you would need only 6 of these to keep up with the 12.5 MHz path.

The cache will also be a bottleneck, since to be most effective it should be shared by all branches. You'd probably want to construct a multiport cache by time-sharing it among several branches. A cache cycling at 100 MHz could service 8 branches running at 12.5 MHz.

--
Don Husby <husby@fnal.gov>        http://www-ese.fnal.gov/people/husby
Fermi National Accelerator Lab    Phone: 630-840-3668
Batavia, IL 60510                 Fax: 630-840-5406

Article: 19441
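Checking the data-path arithmetic above (same assumptions as in the post: a 64x4-bit board vector, a 64-bit bus at 50 MHz, a 500 ns pipelined evaluation, a shared 100 MHz cache):

  board state       = 64 squares x 4 bits = 256 bits = 4 cycles of a 64-bit bus
  branch rate       = 50 MHz / 4 cycles   = 12.5 M board states/s
  one evaluator     = 1 / 500 ns          = 2 M evaluations/s
  evaluators needed = 12.5 / 2            = 6.25, i.e. about the 6 quoted (7 to be safe)
  shared cache      = 100 MHz / 12.5 MHz  = 8 branches per multiport cache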
Ray Andraka wrote:
> And once you get it, I guess you'll need free PWB layout, fab and
> assembly? Most of the modern packaging is not well suited for hobbyist
> work - fine pitch quad flat packs and ball grid arrays take special
> techniques to mount on the board.

for most fpgas in plastic, the cost of the devices is rather cheap. the design software can get a bit pricy, although that is variable (student editions, some companies have "lite" versions for free download, etc.). my software investment is fairly high - it's worth considerably more than the computer (over an order of magnitude more) - and i'll be increasing the amount of software soon (warning to salescritters - this is not an invitation to call! :-).

a good quality pcb for these chips is required, as most modern devices have rather fast edge rates and/or large numbers of i/o's that switch with less noise margin - note that many devices can no longer switch from 0V -> 5V with a switching threshold set at approximately 2.5V; perhaps we'll be seeing and using more differential i/o's. in my opinion it's worth the $ to get a good layout by an experienced layout guy and a solid multi-layer board. of course, this, combined with the software, dwarfs the cost of most fpga devices. heck, even the sockets/adapters get relatively expensive, with many of them running over $100 each.

for assembly, i still use a fair amount of small fpgas in the plcc84; the sockets for these make good contact and are easy for even me to solder onto a pcb. for bga, i'll be trying out a semi-easy-to-solder socket soon; this is not for a high-speed application.

for surface mount of the popular pqfp or bga packages, i would definitely agree, not a job for the hobbyist; a job for the experienced technician with the right experience and equipment.

rk

Article: 19442
Joel Kolstad wrote:
> rk <stellare@nospam.erols.com> wrote in message
> news:385EE12E.F6315E86@nospam.erols.com...
> > perhaps another one ...
> >
> > no self-clearing structures (f-f output to it's clear). find another way to make
> > that pulse! i've seen this too many times!
>
> I've used a form of this when interfacing an FPGA to a DSP's bus (both run
> off different clocks). The DSP's clock registers a write strobe in flip
> flop #1 (while the data bus contents are registered in a bunch of other
> registers). This is followed by two flip flops, #2 and #3 (#2 acting as a
> synchronizer, more or less) clocked by the FPGA's clock. The output of #3
> goes to the FPGA's state machines, etc., and also goes back to the
> asynchronous reset input on 'flop #1. (Note that the DSP is slow compared to
> the FPGA; the output of 'flop #3 fires long before the DSP gets around to
> sending another write strobe.)
>
> Is this dangerous (it doesn't seem to me to be...)? ...and what's a better
> alternative?

hi,

well, i was really thinking of something else.

(at day job) i recently had to interface to a microprocessor running over a bus ... and the bus was run too fast to guarantee that i could clock in the data early in the project ... and in the worst case, a signal from the microprocessor would go to another board and then over to me. in other words, although it was a "synchronous system" i had to treat all incoming signals as asynchronous and sync them. if i understand correctly what you described, i did much the same thing. after synchronization, the pulses (both to my state machines and those clearing out the initial flop) had a width determined by the period of the clock in my chip.

now, the evil-asynchronous-logic-nazis saw this and then started to whine that i had a signal going into an asynchronous clear and got all upset based on that fact alone. since i could guarantee that all pulses were well formed independent of any propagation delay, the metastable-state resolution time that i gave it was quite generous, and everything was controlled by the frequency of a crystal oscillator, the circuit was good(*), in my opinion.

the situation that i was describing above is when the clearing pulse width is determined by gate and routing delays. in the worst case that i have seen, two flip-flops had their outputs NANDed and the output of the NAND went to another sub-circuit and to the clears of both flops <rk takes a minute to heave>. that i consider a dangerous circuit. perhaps i could have worded the original post better, something like "asynchronously self-clearing structure".

(*) actually, the system was a disgrace and the circuit a local patch to make up for poor, or more correctly, non-existent system design. no one knows why the bus was run at high speed, as there was nothing high speed going on and we were shooting for absolute minimum power. the contractor who did this strategically placed pads for R's and C's throughout the entire system to tune things up and get the clock edges in the "right place." <i feel sick again just typing this>. i again note that this was at day job!

have a good evening,

----------------------------------------------------------------------
rk                          The world of space holds vast promise
stellar engineering, ltd.   for the service of man, and it is a
stellare@erols.com.NOSPAM   world we have only begun to explore.
Hi-Rel Digital Systems Design        -- James E. Webb, 1968

Article: 19443
Bob Perlman writes: > I thought I'd mention one place where speed increases cause problems > in synchronous systems: hold times. > What happens if, say, > you've got an FPGA from a -X speed bin driving another FPGA from a -X > speed bin, and the first FPGA is actually a much faster part that's > been restamped -X because the vendor had an over-abundance of fast > parts? Maybe you have a hold time problem, and maybe you don't. I've been designing for the Altera Flex10KE devices. They specify a /minimum/ external clock-to-output delay of 2ns, and an external hold time of 0ns. So you have up to 2ns of skew to play with between devices. Cool huh? We need it :-) -- JamieArticle: 19444
A better method is to toggle a flag flip-flop each time you write the DSP register (i.e., you have one extra bit on the register which is loaded with its inverted output). Then take that flag, synchronize it to the FPGA domain, and then use a change in state to generate a write pulse in the FPGA clock domain. You can minimize the latency hit if you design the write pulse state machine (gray code, 4 states) so that the flag input is only sensed by one flip-flop. The way you are doing it can get you into trouble if the DSP comes in and sets the flop near the time you do the reset. This one works as long as the FPGA is a little faster than the DSP (the smaller the differential, the less margin you have, though):

valid: process (GSR, clk)
  variable state : std_logic_vector(1 downto 0);
  variable sync  : std_logic;
begin
  if GSR = '1' then
    sync  := '0';
    state := "00";
  elsif clk'event and clk = '1' then
    sync := toggle;
    case state is
      when "00" =>
        if sync = '1' then state := "01"; else state := "00"; end if;
        wp <= '0';
      when "01" =>
        state := "11";
        wp <= '1';
      when "11" =>
        if sync = '0' then state := "10"; else state := "11"; end if;
        wp <= '0';
      when "10" =>
        state := "00";
        wp <= '1';
      when others =>
        null;
    end case;
  end if;
end process;

Joel Kolstad wrote:
> rk <stellare@nospam.erols.com> wrote in message
> news:385EE12E.F6315E86@nospam.erols.com...
> > perhaps another one ...
> >
> > no self-clearing structures (f-f output to it's clear). find another way to make
> > that pulse! i've seen this too many times!
>
> I've used a form of this when interfacing an FPGA to a DSP's bus (both run
> off different clocks). The DSP's clock registers a write strobe in flip
> flop #1 (while the data bus contents are registered in a bunch of other
> registers). This is followed by two flip flops, #2 and #3 (#2 acting as a
> synchronizer, more or less) clocked by the FPGA's clock. The output of #3
> goes to the FPGA's state machines, etc., and also goes back to the
> asynchronous reset input on 'flop #1. (Note that the DSP is slow compared to
> the FPGA; the output of 'flop #3 fires long before the DSP gets around to
> sending another write strobe.)
>
> Is this dangerous (it doesn't seem to me to be...)? ...and what's a better
> alternative?
>
> Thanks...
>
> ---Joel Kolstad

--
-Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email randraka@ids.net
http://users.ids.net/~randraka

Article: 19445
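For completeness, a sketch of the toggle bit itself on the DSP-write side - the "extra bit loaded with its inverted output" that Ray describes - with names invented here, and assuming the registered write strobe dsp_wr acts as the clock for the register bank:

-- DSP clock domain: the flag flips on every write, so the FPGA domain
-- only has to detect a change of state, never catch a narrow pulse.
process (arst, dsp_wr)
begin
  if arst = '1' then
    toggle <= '0';
  elsif dsp_wr'event and dsp_wr = '1' then
    data_reg <= dsp_data;        -- capture the bus contents
    toggle   <= not toggle;      -- exactly one transition per write
  end if;
end process;

In Ray's state machine above, sync gives one stage of synchronization of the flag into the FPGA clock domain; adding a second flip-flop stage ahead of it is a common extra precaution against metastability.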
Christof Paar wrote: > Just a brief question: How reliable are the timing results which the the > M1 P&R tool (on Unix) provides for XC4000 family designs? In particular, > how likely is it that the maximum critical path delay can be met in an > actual design. I'm not quite sure what you are asking: are you asking "if the Xilinx toolset reports 15 ns maximum for a path, how likely is that path going to be more than 15 ns in a real part"? Answer is: unlikely. Xilinx seems to do a good job. Aiding this is that the XC4000 parts are not cutting edge, they have been in use for years. I have had fairly good experience with XC4000 family parts. Or are you asking for real quality statistics from a list of major users of Xilinx parts? Or are you asking "the Xilinx toolset reports 15.2 ns maximum and I want to run it at 15 ns period, what are my odds of it working? Answer is: not as likely as the first, but still pretty good. Usually parts at nominal conditions (temperature and voltage) are faster than the xdelay number. How much faster? How lucky do you feel? I do not suggest this for any type of critical application: life support, machine controls, or anything else that might cause harm if the device malfunctions. Or are you asking "the Xilinx toolset reports 15 ns, and my circuit will not work if the delay is faster than 7 ns"? The answer to this is that your circuit is likely not to work sometime. It might work when you are checking it out, but failure when showing it to someone important is guaranteed by one variant of Murphy's Law. Never count on minimum delays. Keep your clock skew as low as possible. Never draw to an inside straight. -- Phil Hays "Irritatingly, science claims to set limits on what we can do, even in principle." Carl SaganArticle: 19446
Rich,

I'm surprised you've had good luck with the PLCC 84 sockets and FPGAs. I've had some very bad experiences with those. They seem OK the first time you put a chip in. Remove and replace the chip once or twice and let the games begin.

rk wrote:
> Ray Andraka wrote:
> > And once you get it, I guess you'll need free PWB layout, fab and
> > assembly? Most of the modern packaging is not well suited for hobbyist
> > work - fine pitch quad flat packs and ball grid arrays take special
> > techniques to mount on the board.
>
> for most fpgas in plastic, the cost of the devices is rather cheap. the
> design software can get a bit pricy, although that is variable (student
> editions, some companies have "lite" versions for free download, etc.). my
> software investment is fairly high - it's worth considerably more than the
> computer (over an order of magnitude more) - and i'll be increasing the
> amount of software soon (warning to salescritters - this is not an
> invitation to call! :-).
>
> a good quality pcb for these chips is required, as most modern devices have
> rather fast edge rates and/or large numbers of i/o's that switch with less
> noise margin - note that many devices can no longer switch from 0V -> 5V
> with a switching threshold set at approximately 2.5V; perhaps we'll be
> seeing and using more differential i/o's. in my opinion it's worth the $ to
> get a good layout by an experienced layout guy and a solid multi-layer
> board. of course, this, combined with the software, dwarfs the cost of most
> fpga devices. heck, even the sockets/adapters get relatively expensive,
> with many of them running over $100 each.
>
> for assembly, i still use a fair amount of small fpgas in the plcc84; the
> sockets for these make good contact and are easy for even me to solder onto
> a pcb. for bga, i'll be trying out a semi-easy-to-solder socket soon; this
> is not for a high-speed application.
>
> for surface mount of the popular pqfp or bga packages, i would definitely
> agree, not a job for the hobbyist; a job for the experienced technician
> with the right experience and equipment.
>
> rk

--
-Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email randraka@ids.net
http://users.ids.net/~randraka

Article: 19447
The output from the timing analyzer is real. The summary results from PAR are ballpark, but they do usually seem to err on the conservative side. Christof Paar wrote: > Just a brief question: How reliable are the timing results which the the > M1 P&R tool (on Unix) provides for XC4000 family designs? In particular, > how likely is it that the maximum critical path delay can be met in an > actual design. > > Thanks, > > Christof > > ! WORKSHOP ON CRYPTOGRAPHIC HARDWARE AND EMBEDDED SYSTEMS (CHES 2000) ! > ! WPI, August 17 & 18, 2000 ! > ! http://www.ece.wpi.edu/Research/crypt/ches ! > > *********************************************************************** > Christof Paar, Assistant Professor > Cryptography and Information Security (CRIS) Group > ECE Dept., WPI, 100 Institute Rd., Worcester, MA 01609, USA > fon: (508) 831 5061 email: christof@ece.wpi.edu > fax: (508) 831 5491 www: http://ee.wpi.edu/People/faculty/cxp.html > *********************************************************************** -- -Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email randraka@ids.net http://users.ids.net/~randrakaArticle: 19448
In article <385ED38E.24214D76@magtech.com.au>, mikes@magtech.com.au (Michael Stanton) wrote: > Hi Thomas > > We have never had to alter any lines inside the Jam source file and > have always > been able to use the .jam file produced by Max+Plus II. > > The following is the DOS command line we use to program a FLEX 10K30A > as part of > a three device JTAG chain : > > jam -v -dDO_CONFIGURE=1 -p378 cpld_top.jam > > We are using Jam.exe ver 1.2 and Max+Plus II 9.3 and have a > ByteBlasterMV > connected to a standard PC printer port (LPT1 at 378h) via a 2m long > D25M-D25F > extension cable. > > There are two versions of the jam.exe ; 16-bit-DOS and Win95-WinNT. > Have you > tried each version ? A while back, the Jam player had some nasty 16-bit limitations which prevented the DOS version from programming most 10Ks. That may have changed now, but the solution then was not to use it! Altera publicly promised to fix it, but privately said don't hold your breath. For FLEX10K devices, Jam is a useless waste of effort. The bit stream is just clocked in with no fancy algorithm, and a simple program that reads the .SOF file[s] is orders of magnitude faster and smaller. -- Steve Rencontre, Design Consultant http://www.rsn-tech.demon.co.ukArticle: 19449
The timing numbers from the timing analyzer are real for worst case delays. The numbers reported in the PAR summary are ballpark, but they seem to generally be conservative. Run timing analyzer and use those times as the worst case numbers, as they are more accurate. Keep in mind that the numbers reported are worst case delays over temperature, voltage and process. Realistically, you won't see those delays. Use them as a bounding box, and your design will be fine. If you feel lucky and you know (for instance) that the temperature will always be 70F, you can push them beyond. How much? who knows. You could characterize your part to find out *for that part* if so inclined. At nominal voltage and room temperature you can see delays that are less than half of the worst case delays. Also, be aware that slower speed grades may be faster parts marked as slow parts. All this goes to further the argument against depending on logic delays to make the design work. Christof Paar wrote: > Just a brief question: How reliable are the timing results which the the > M1 P&R tool (on Unix) provides for XC4000 family designs? In particular, > how likely is it that the maximum critical path delay can be met in an > actual design. > > Thanks, > > Christof > > ! WORKSHOP ON CRYPTOGRAPHIC HARDWARE AND EMBEDDED SYSTEMS (CHES 2000) ! > ! WPI, August 17 & 18, 2000 ! > ! http://www.ece.wpi.edu/Research/crypt/ches ! > > *********************************************************************** > Christof Paar, Assistant Professor > Cryptography and Information Security (CRIS) Group > ECE Dept., WPI, 100 Institute Rd., Worcester, MA 01609, USA > fon: (508) 831 5061 email: christof@ece.wpi.edu > fax: (508) 831 5491 www: http://ee.wpi.edu/People/faculty/cxp.html > *********************************************************************** -- -Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email randraka@ids.net http://users.ids.net/~randraka