Site Home Archive Home FAQ Home How to search the Archive How to Navigate the Archive
Compare FPGA features and resources
Threads starting:
Authors:A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
On Jun 25, 6:05 pm, John Williams <jwilli...@itee.uq.edu.au> wrote:
> Regardless of the FPGA you choose, or the implementation of your main
> processing loop, don't even start until you've done a thorough IO /
> memory bandwidth analysis. Even in 2007 we're still seeing papers
> saying "we got 20X speed up in the core, then when we put it on the
> memory bus we got 1.5X".

In many respects that is still putting the cart before the horse. One needs to step back, divine the global architecture of the application, and determine whether this is even a practical problem for an FPGA, or whether it should be moved to a better-suited CPU/cache/memory architecture.

For a large data set, memory bandwidth isn't going to be substantially different, unless the algorithm can be twisted to reduce memory bandwidth through a different processing order or local caching. This is one area where a CPU/cache/memory architecture may well simply smoke any possible FPGA design. For very large data sets, processors like the Itanium 2 with caches of 12MB and larger will frequently smoke a typical PC or FPGA implementation simply by moving significant portions of the raw memory bandwidth into faster caches.

One side effect of this is that the existing source code for the application may already have been heavily optimized to squeeze every possible memory cycle out of the problem. The resulting code is probably larger, and possibly far too complex to serve as a starting point for any FPGA design. One might well have to reverse engineer the original, simpler algorithms first, in the hope that there is indeed an embarrassingly parallel FPGA solution that avoids the raw memory bandwidth.

Other cases of interest are choices of data path size. They might all be 64 bit simply because the code was optimized for a high-end HPC engine that was 64-bit native. After stepping back and looking at the problem very closely from an architecture and requirements perspective, insights sometimes emerge that the problem doesn't need all that significance and dynamic range everywhere, allowing the FPGA implementation to be partitioned to match the real needs of the problem, not what had formerly been done in the prior solution.

Once the architecture, data and algorithm issues are well understood, and we have a fair idea of what the processing kernel must do, then looking at matching interface designs is not only required, but finally practical too. Designing interfaces before the architecture and processing kernels are understood is doing work toward an undefined requirement.

Performance by Design
John
Article: 121301
Note that as long as there is no other logic inside the CPLD, the result for an individual chip should be very predictable. You can measure and correct the nonlinearity. If there is a big difference from chip to chip, you need to calibrate the correction for each chip.

There might be a significant dependence on temperature. In that case you would have to ovenize the thing, which is probably more expensive than the discrete solution.

I am sure that extremely fast ECL logic works well, but I am not sure how important that is. Maybe CMOS single-gate logic with a very clean power supply is good enough. In that case the logic would be very small and even less expensive than the CPLD.

Kolja Sulimma

On Jun 29, 1:44 pm, "Ulrich Bangert" <d...@ulrich-bangert.de> wrote:
> Kolja,
>
> I needed only a brief look at your website to see that you surely know what
> you are talking about. Yes, clearly a TDC can be used for that purpose. My
> efforts to do it the linear way are motivated by making a very cheap
> technology available that would enable, say, radio amateurs to characterize
> precise oscillators. In terms of price the modern CPLDs come in very handy.
> I will think about your suggestions.
>
> Best regards
> Ulrich Bangert
>
> "comp.arch.fpga" <ksuli...@googlemail.com> wrote in message news:1183106805.864519.234350@q75g2000hsh.googlegroups.com...
>
> > On Jun 28, 1:00 pm, Jim Granville <no.s...@designtools.maps.co.nz>
> > wrote:
>
> > > So, do not use a CPLD as the Phase Comparator, but use an LVC Exor gate,
> > > preferably with Schmitt, e.g. LVC1G97, with analog supply decoupling.
>
> > Actually, this might not be sufficient. The clock-to-out delay of the
> > flip-flops in the CPLD and of the output drivers of the CPLD depends on
> > the supply voltage. The supply voltage inside the chip depends on the
> > switching history of all nearby signals in the chip.
> > (E.g. the carry chain delay of a Spartan-2 doubles during an interval
> > of a few hundred picoseconds after the flip-flops connected to it
> > were switching.)
>
> > The same is true of the input thresholds of the flip-flop clock
> > inputs. At the precisions you are talking about
> > you will see a lot of crosstalk from different sources.
>
> > It surely will help to use discrete fast flip-flops instead of a CPLD
> > (we use PECL devices from OnSemi).
> > At least you don't see package ground bounce across packages.
>
> > Another option is to measure the positions of the edges individually
> > and compute the phase difference from that. Many Time To Digital
> > Converters have two or more channels. See our website www.cronologic.de
> > for an example.
>
> > Kolja Sulimma
Article: 121302
In the Xilinx programmer diagram (JTAG/parallel download cable) (0380507) there is a note number 2, which tells us that d6, busy and pe are connected at the DB25 end of the data cable. What does that mean? Where is the DB25 end? What will happen if I connect them on the PCB instead?

Thanks in advance

P.S. Does anyone know the purpose of d1 and d2, c1 to c4, r9 to r12, r3 to r7, and r1/r2/r14/r8/r13/c5? Basically, it would be nice to have a list of all the components along with their "raison d'être" (feel free not to answer this P.S.).
Article: 121303
The DB25 end is the end that connects to the parallel port on your computer (so called because the connector is shaped like a D and has 25 pins/holes). There most likely isn't even an easy way for you to connect it to your board, but if you could, what would happen would largely depend on what was connected to the other end.

---Matthew Hicks

> In the xilinx programmer diagram (jtag/parallel download cable)
> (0380507) there is note number 2 which tells us that d6 busy
> and pe are connected at the db25 end of data cable.
> What does that mean? where is the db25 end?
> What will happen if I will connect it on the pcb instead?
> Thanks in advance
> p.s.
> Anyone knows what is the purpose of d1 and d2 and
> c1 to c4 and
> r9 to r12 and
> r3 to r7 and
> r1/r2/r14/r8/r13/c5
> Basically, it would be nice to have a list of all the components along
> with their "raison d'être"
> (feel free not to answer this p.s.)
Article: 121304
zlatkopetrov@yahoo.com wrote:
> Does somebody know how to execute (pass) several commands inside xps?
>
> Now I invoke xps from a batch file through
> xps -nw project_dir\system.xmp
>
> Then I would like to execute several commands and to exit using
> commands passed through script file.
>
> Is there a way to do, the xilinx docs are not clear for this questions
> or at least I have not found the right tutorial.

This should work:

    xps -nw project_dir/system.xmp < script_file

where script_file contains the commands. You may need "exit" as the last command, just try it and see.

Regards,
John
Article: 121305
There is one DB25 on the PCB and one on the cable, so what difference does it make where we make the connections from note 2?
Article: 121306
On Jun 30, 9:03 pm, "cpope" <cep...@nc.rr.com> wrote:
> I have an SOC design built in EDK 8.2.03 for a v4fx12. The fpga boots from
> an xcf08p serial prom. I have an intermittent problem that seems to come and
> go with every rebuild. What happens is the chip will configure from prom
> okay but the software doesn't run. I can tell the chip is fully configured
> because my debug led lights and the current jumps to full. The design always
> runs fine when I load the fpga over jtag so something is squirrelly with the
> boot from prom. Also, when it misconfigures the PPC core doesn't appear on
> the jtag scan path with xmd so I can't even load the software manually.
>
> Here are my bitgen options:
> -g ConfigRate:4
> -g CclkPin:PULLUP
> -g TdoPin:PULLNONE
> -g M1Pin:PULLNONE
> -g DonePin:PULLUP
> -g DriveDone:Yes
> -g StartUpClk:JTAGCLK
> -g DONE_cycle:4
> -g GTS_cycle:5
> -g M0Pin:PULLNONE
> -g M2Pin:PULLNONE
> -g ProgPin:PULLUP
> -g TckPin:PULLUP
> -g TdiPin:PULLUP
> -g TmsPin:PULLUP
> -g DonePipe:No
> -g GWE_cycle:6
> -g LCK_cycle:NoWait
> -g Security:NONE
> -m
> -g Persist:No
>
> Any help is appreciated,
> Clark

We had a similar issue with a V2 Pro system last year, and it turned out to be related to timing. The DONE light would always signal completion, but the memory test built into the BRAM to run on the PPC would fail. The part that was spec'd to be on the board was a -6, but the item was changed on the BOM to a -5 (because of availability) and no one told the firmware / software team. So we were meeting all timing according to XPS / ISE, but for the wrong part.

You might want to verify that everything has been specified correctly for the system you are targeting.

Greg
Article: 121307
"Greg Crocker" <greg.crocker@gmail.com> wrote in message news:1183337586.755248.18570@n60g2000hse.googlegroups.com... > On Jun 30, 9:03 pm, "cpope" <cep...@nc.rr.com> wrote: > > I have an SOC design built in EDK 8.2.03 for a v4fx12. The fpga boots from > > an xcf08p serial prom. I have an intermittent problem that seems to come and > > go with every rebuild. What happens is the chip will configure from prom > > okay but the software doesn't run. I can tell the chip is fully configure > > because my debug led lights and the current jumps to full. The design always > > runs fine when I load the fpga over jtag so something is squirelly with the > > boot from prom. Also, when it misconfigures the PPC core doesn't appear on > > the jtag scan path with xmd so I can't even load the software manually. > > > > Here is my bitgen options: > > -g ConfigRate:4 > > -g CclkPin:PULLUP > > -g TdoPin:PULLNONE > > -g M1Pin:PULLNONE > > -g DonePin:PULLUP > > -g DriveDone:Yes > > -g StartUpClk:JTAGCLK > > -g DONE_cycle:4 > > -g GTS_cycle:5 > > -g M0Pin:PULLNONE > > -g M2Pin:PULLNONE > > -g ProgPin:PULLUP > > -g TckPin:PULLUP > > -g TdiPin:PULLUP > > -g TmsPin:PULLUP > > -g DonePipe:No > > -g GWE_cycle:6 > > -g LCK_cycle:NoWait > > -g Security:NONE > > -m > > -g Persist:No > > > > Any help is appreciated, > > Clark > > We had a similar issue with a V2 pro system last year and the issue > was related to timing, the DONE light would always signal completion, > but the memory test built into the BRAM to run on the PPC would fail. > The part that was spec'd to be on the board was a -6 but the item was > changed on the BOM to a -5 (b/c of availability) and no one told the > firmware / software team, so we were meeting all timing according to > XPS / ISE, but that was for the wrong part. You might want to verify > that everything has been specified correctly for the system you are > targeting. 
> > Greg > Don't think this is it because (a) this had been booting fine for several months on the same hardware (b) I'm already using the slowest speed grade (-10) but thanks for the lead. -ClarkArticle: 121308
Is it possible to use the sma connector on the s3a or s3e boards as a signal output not just a clock output ? I haven't properly used a dcm yet (only a few very basic designs) so I'm not sure if this is possible or not. Trying to find a cheap way of doing a software defined radio in an fpga. Thank you AlexArticle: 121309
Hello,

I would like to do a 32bit multiplication. The result must be stored in a "64bitregister".

I did that:

    Xuint32 test1=0xFFFFFFFF;
    Xuint32 test2=0xBBBBBBBB;
    Xuint64 *res64;
    res64 ->Upper = (test1 * test2) >> 32;
    res64 ->Lower = test1 * test2;

But it didn't work! Do you have an idea how to do that?

Regards,
Laurent.
Article: 121310
LilacSkin wrote:
> Hello,
>
> I would like to do a 32bit multiplication.
> The result must be stored in a "64bitregister".
>
> I did that :
>
> Xuint32 test1=0xFFFFFFFF;
> Xuint32 test2=0xBBBBBBBB;
> Xuint64 *res64;
> res64 ->Upper = (test1 * test2) >> 32;
> res64 ->Lower = test1 * test2;
>
> But it didn't work !
> Do you have an idea how to do that ?
>
> Regards,
> Laurent.
>

What about

    Xuint32 test1=0xFFFFFFFF;
    Xuint32 test2=0xBBBBBBBB;
    Xuint64 res64;
    res64 = mul32by32(test1, test2);
Article: 121311
On Jul 2, 3:37 am, LilacSkin <lpaul...@iseb.fr> wrote:
> Hello,
>
> I would like to do a 32bit multiplication.
> The result must be stored in a "64bitregister".
>
> I did that :
>
> Xuint32 test1=0xFFFFFFFF;
> Xuint32 test2=0xBBBBBBBB;
> Xuint64 *res64;
> res64 ->Upper = (test1 * test2) >> 32;
> res64 ->Lower = test1 * test2;
>
> But it didn't work !
> Do you have an idea how to do that ?
>
> Regards,
> Laurent.

You need to type cast test1 and test2 to be 64 bits before you multiply them together.

Try:

    *res64 = ((Xuint64)test1) * ((Xuint64)test2);

Regards,
John McCaskill
www.fastertechnology.com
Article: 121312
On Sat, 30 Jun 2007 21:03:50 -0400, "cpope" <cepope@nc.rr.com> wrote:
>I have an SOC design built in EDK 8.2.03 for a v4fx12. The fpga boots from
>an xcf08p serial prom. I have an intermittent problem that seems to come and
>go with every rebuild. What happens is the chip will configure from prom
>okay but the software doesn't run. I can tell the chip is fully configure
>because my debug led lights and the current jumps to full. The design always
>runs fine when I load the fpga over jtag so something is squirelly with the
>boot from prom. Also, when it misconfigures the PPC core doesn't appear on
>the jtag scan path with xmd so I can't even load the software manually.
>
>Here is my bitgen options:
>-g ProgPin:PULLUP

With the PPC in the Virtex-II Pro, the advice is to _pulse_ the Prog pin at the appropriate time (Impact does this for me); that was after JTAG config. I can't recall hearing a satisfactory explanation why, but it DID give an utterly reliable boot after a 10-20% success rate previously.

Perhaps the same is true of the PPC in the V4? Searching Xilinx support for "pulse PROG pin" may yield more info.

- Brian
Article: 121313
> You need to type cast test1 and test2 to be 64 bits before you > multiply them together. > > Try: > > *res64 = ((Xuint64)test1) * ((Xuint64)test2); > Nope ... that would work if xilinx used the "unsigned long long" type for their 64 bits type ... But they didn't ... (don't ask me why they did that ...) typedef struct { Xuint32 Upper; Xuint32 Lower; } Xuint64; SylvainArticle: 121314
On Jul 2, 6:49 am, Sylvain Munaut <tnt-at-246tNt- dot-...@youknowwhattodo.com> wrote:
> > You need to type cast test1 and test2 to be 64 bits before you
> > multiply them together.
>
> > Try:
>
> > *res64 = ((Xuint64)test1) * ((Xuint64)test2);
>
> Nope ... that would work if xilinx used the "unsigned long long" type for their 64 bits type ...
> But they didn't ... (don't ask me why they did that ...)
>
> typedef struct
> {
>     Xuint32 Upper;
>     Xuint32 Lower;
>
> } Xuint64;
>
> Sylvain

My mistake then. I boot into Linux and use u_int64_t or unsigned long long, as you mention. A bit more type casting would get the OP there, I think. Or just use the routine you mentioned.

Regards,
John McCaskill
www.fastertechnology.com
Article: 121315
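For reference, the "bit more type casting" from the multiply thread above can be sketched as follows. This is only a sketch under stated assumptions: the Xuint32 and Xuint64 definitions below are local stand-ins that follow the struct layout quoted in the thread (the real Xilinx header is not reproduced here), and mul32x32 is a hypothetical helper name, not a Xilinx routine. It also writes into a caller-supplied struct, since the original snippet dereferenced a pointer that was never pointed at any storage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the types discussed in the thread (assumption:
   the layout matches the typedef quoted above). */
typedef uint32_t Xuint32;
typedef struct {
    Xuint32 Upper;
    Xuint32 Lower;
} Xuint64;

/* Hypothetical helper, not a Xilinx API: multiply two 32-bit values and
   store the full 64-bit product in *out. */
static void mul32x32(Xuint32 a, Xuint32 b, Xuint64 *out)
{
    /* out must point at real storage supplied by the caller. */
    assert(out != NULL);

    /* Widen BEFORE multiplying; (a * b) on 32-bit operands would
       already have discarded the upper half of the product. */
    uint64_t product = (uint64_t)a * (uint64_t)b;

    out->Upper = (Xuint32)(product >> 32);
    out->Lower = (Xuint32)(product & 0xFFFFFFFFu);
}
```

With the values from the original post (0xFFFFFFFF and 0xBBBBBBBB), declaring `Xuint64 r;` and calling `mul32x32(test1, test2, &r)` should leave r.Upper at 0xBBBBBBBA and r.Lower at 0x44444445.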
Does Xilinx ISE benefit from multi-CPU setups, like those offered by AMD Athlon64 X2, AMD Opteron, Intel Core2Duo etc.? Also, would an AMD AM2 socket + 800MHz DDR2 be really beneficial compared to non-DDR2 motherboards?
Article: 121316
About the parallel port jtag programmer, can anyone say what is the reason we require resistors at the inputs to the buffers. I thought buffers have high input resistance anyway?Article: 121317
pbFJKD@ludd.invalid wrote:
>Does Xilinx ISE benefit from Multi CPU setups?
>Like offered by AMD Athlon64 X2, AMD Opteron, Intel Core2Duo etc..?
>Also would AMD AM2 socket + 800MHz DDR2 be really benefitial compared to
>non DDR2 motherboards?

I'm curious about the Linux version of ISE.
Article: 121318
On Jul 2, 7:22 am, <darrick> wrote:
> About the parallel port jtag programmer,
> can anyone say what is the reason we require
> resistors at the inputs to the buffers.
> I thought buffers have high input resistance anyway?

Many CMOS devices have protection diodes between the inputs and the power supplies. That means that once an input goes more than a diode drop outside the supply voltage, it is no longer high impedance, since the diodes start to conduct. The resistors are probably there to limit the diode current in this situation.

This is especially true if the receiving logic device has a lower supply voltage than the sending device may drive. In that case, the external resistor and the internal diode function as a level shifter. You can use the same trick to put 5V signals into some FPGA families (Spartan-3 for example); the resistor needs to be sized to keep the maximum current through the protection diode within its data sheet limit.
Article: 121319
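The resistor sizing described above is just Ohm's law across the series resistor once the protection diode conducts. A minimal sketch follows; the function name is hypothetical, and the numbers in the usage note (0.5 V diode drop, 10 mA current limit) are placeholders chosen for illustration, not data sheet values. The real limits have to come from the data sheet of the receiving part.

```c
#include <assert.h>

/* Hypothetical helper: smallest series resistance (in ohms) that keeps the
   clamp-diode current at or below i_max (amps) when a signal at v_in (volts)
   drives an input whose supply is v_supply and whose protection diode drops
   v_diode. Returns 0.0 when the diode would not conduct at all. */
static double min_series_resistance(double v_in, double v_supply,
                                    double v_diode, double i_max)
{
    assert(i_max > 0.0);

    /* Voltage left across the resistor while the diode is conducting. */
    double overdrive = v_in - v_supply - v_diode;
    if (overdrive <= 0.0) {
        return 0.0; /* input stays within a diode drop of the rail */
    }
    return overdrive / i_max;
}
```

With the placeholder values, a 5 V signal into a 3.3 V input gives (5 - 3.3 - 0.5) V / 10 mA, or about 120 ohms as the lower bound. In practice one would pick a larger standard value, as far as the required edge speed allows.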
It is for my parameter update. A=N*Lambda, where N is the iteration number from 1 to 5 and Lambda is a number in 'std_logic_vector(15 downto 0)'. A1=1*Lambda, A2=2*Lambda, A3=3*Lambda, ... The multiplier deals with numbers of type 'signed'/'unsigned', so I convert the two parameters to std_logic_vector. That is why I use 65535 * 15. I want to know how to do it if I only need 16 bits for the output. Thanks.

John_H wrote:
> ZHI wrote:
> > I have a problem about the data width of the output of multiplier.
> > I am using the coregen Multiplier in Xilinx.
> > a : std_logic_vector(3 downto 0);
> > b : std_logic_vector(15 downto 0);
> > q: std_logic_vector(15 downto 0);
> >
> > I simulated the result by using Modelsim. I found 'q' cannot get
> > correct result unless I change the width of 'q' to 19.
> >
> > I want to know if it is possible to get 16 bit accurate result.
> > Because i need 'q' to connect to another input port with 16 bits
> > width.
> >
> > One more thing: can Modelsim simulate the delay time of the
> > Coregen Multiplier? I have pre-defined the parameters of the Multiplier
> > so that the latency is zero. Why does the result on 'q' appear with
> > a long delay after the inputs 'a' and 'b' are written to the port?
> >
> > Any comment about the multiplier is appreciated. Thank you.
> >
> What's 65535 * 15 ?
Article: 121320
Thanks John ! John Williams : > zlatkopetrov@yahoo.com wrote: > > Does somebody know how to execute (pass) several commands inside xps? > > > > Now I invoke xps from a batch file through > > xps -nw project_dir\system.xmp > > > > Then I would like to execute several commands and to exit using > > commands passed through script file. > > > > Is there a way to do, the xilinx docs are not clear for this questions > > or at least I have not found the right tutorial. > > This should work: > > xps -nw project_dir/system.xmp < script_file > > where script_file contains the commands. You may need "exit" as the > last command, just try it and see. > > Regards, > > JohnArticle: 121321
"ZHI" <threeinchnail@gmail.com> wrote in message news:1183386642.801925.162180@q69g2000hsb.googlegroups.com... > It is for my parameter update. A=N*Lamda, N is iternation number from > 1 to 5, and the Lamda is a number in 'std_logic_vector(15 downto 0)'. > A1=1*lamda,A2=2*lamda,A3=3*Lamda,....The multiplier deals with the > number with the type 'signed'/'unsigned'. So i change the 2 parameter > to the std_logic_vector. > That is why i use 65535 * 15. > I want to know if i only need 16bits for the output, how to do it. > thanks. I asked you the leading question and you gave me background, not an answer. 16'hffff * 4'hf = 20'hefff1 (16'd65535 * 4'd15 = 20'd983025) The 20-bit value efff1 cannot be represented in 16 bits. The question you need to ask yourself is what accuracy you need in your parameter value that you're trying to restrict to 16 bits. Perhaps you can get by with 9 bits. It's ENTIRELY dependent on your needs and NOT something we can give you advice on without knowing the details of your needs. The details you gave only give a glimpse at issues that are probably too detailed to communicate effectively online. Are you familiar with the concepts of Error Analysis and Quantization Error? - John_HArticle: 121322
The documentation for the pinout is here: http://direct.xilinx.com/bvdocs/publications/ds097.pdf <darrick> wrote in message news:46863634$1_2@mk-nntp-2.news.uk.tiscali.com... >I would like to ask some questions regarding a Xilinx JTAG programmer: > > First, it seems that the programmer doesn't actually connect to the LPT > port because of gender mismatch. Luckily I have a parallel port gender > changer. Is this still ok? > > Second point, I connect the programmer and start up xilinx ise and impact. > I get a message that many unknown devices are being detected, is this > normal? > > Last point, the download cable seems to be some sort of 2 x 8 block > socket, i.e. 16 pins, how do I identify the required pins i.e. > vdd,gnd,tdi,tms,tck,tdo? > > Thanks in advance > >Article: 121323
ZHI wrote:
> I want to know if i only need 16bits for the output, how to do it.
> thanks.

You are dealing with binary fractions, not integers, right? That is, numbers between 0 and 1 (or -1 to 1)? In your case, just take the leftmost 16 bits of the 20 bit result.

-Jeff
Article: 121324
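The arithmetic in this thread can be checked with a few lines of C. This is a sketch only: the function names are made up for illustration, and it models the unsigned 16-bit by 4-bit case from the posts, not the Coregen core itself. It shows why the full product needs 20 bits and what "take the leftmost 16 bits" amounts to.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits needed to hold v as an unsigned value (0 needs 0 bits). */
static unsigned bits_needed(uint32_t v)
{
    unsigned n = 0;
    while (v != 0) {
        n++;
        v >>= 1;
    }
    return n;
}

/* Full-width product of a 16-bit operand b and a 4-bit operand a.
   The result always fits in 16 + 4 = 20 bits. */
static uint32_t mul16x4(uint16_t b, uint8_t a)
{
    assert(a <= 0xF);
    return (uint32_t)b * (uint32_t)a;
}

/* Keep the leftmost 16 bits of a 20-bit product by dropping the 4 LSBs.
   The value returned is the product divided by 16, truncated. */
static uint16_t top16_of_20(uint32_t product20)
{
    assert(product20 <= 0xFFFFFu);
    return (uint16_t)(product20 >> 4);
}
```

For the worst case in the thread, 65535 * 15 = 983025 = 0xEFFF1, which needs 20 bits; dropping the low 4 bits leaves 0xEFFF. Note that the truncated value is scaled down by 16, so whatever consumes the 16-bit result has to interpret it with that scale. If N really never exceeds 5, the largest product is 5 * 65535 = 327675, which fits in 19 bits.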
Hi, For the Spartan-3A Starter Kit, you can use it as an input or output, but I would note that it's single-ended, using the system ground as the shield/reference. Eric "Alex Gibson" <news@alxx.org> wrote in message news:5erd29F39rc71U1@mid.individual.net... > Is it possible to use the sma connector on the s3a or s3e boards as a > signal output not just a clock output ? > > I haven't properly used a dcm yet (only a few very basic designs) > so I'm not sure if this is possible or not. > > Trying to find a cheap way of doing a software defined radio in an fpga. > > Thank you > > Alex >