hi, for PIO mode, the DIOR (DDMARDY?) is the data strobe. 20ns setup and 5ns hold time are guaranteed by the device. At the host side, if you use Spartan or Virtex, I believe the Xilinx FPGA does not require any hold time, and you have plenty of setup time already. For Ultra DMA mode, it's more tricky. Data is strobed by both edges of DSTROBE (DMA in). I believe the ATA-6 spec does not recommend strobing data directly with DSTROBE. Instead, use another clock, a "synchronized version of DSTROBE" (that's what they call it : ), to strobe data during DMA in. You may look at the 6 asynchronous circuits by Petter on the Xilinx web site for reference. Good luck.
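To illustrate the "synchronized version of DSTROBE" idea above, here is a minimal VHDL sketch for the DMA-in direction. It is not taken from the ATA-6 spec or from any particular design: the entity, port and signal names (dstrobe, dd_in, host_clk) are made up, a real UDMA host would also need a small FIFO and CRC handling, and only the double-edge capture plus the hand-off to the host clock domain are shown.

library ieee;
use ieee.std_logic_1164.all;

entity udma_in_capture is
    port (
        dstrobe  : in  std_logic;                      -- strobe from the device
        dd_in    : in  std_logic_vector(15 downto 0);  -- DD[15:0] from the device
        host_clk : in  std_logic;                      -- local host clock
        data_out : out std_logic_vector(15 downto 0);
        data_vld : out std_logic
    );
end udma_in_capture;

architecture rtl of udma_in_capture is
    signal cap_rise, cap_fall : std_logic_vector(15 downto 0);
    signal toggle             : std_logic := '0';
    signal toggle_sync        : std_logic_vector(2 downto 0) := (others => '0');
begin
    -- capture one word on each edge of DSTROBE; DSTROBE clocks only these
    -- capture registers, never the rest of the design
    process (dstrobe)
    begin
        if rising_edge(dstrobe) then
            cap_rise <= dd_in;
            toggle   <= not toggle;   -- flag: a new rising-edge word arrived
        end if;
    end process;

    process (dstrobe)
    begin
        if falling_edge(dstrobe) then
            cap_fall <= dd_in;
        end if;
    end process;

    -- bring the "new data" flag into the host clock domain and read the
    -- captured word there; a small FIFO would normally sit here as well
    process (host_clk)
    begin
        if rising_edge(host_clk) then
            toggle_sync <= toggle_sync(1 downto 0) & toggle;
            data_vld    <= toggle_sync(2) xor toggle_sync(1);
            data_out    <= cap_rise;  -- simplified: a real design must also move cap_fall
        end if;
    end process;
end rtl;

Article: 63651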
In article <bpum8f$1srj32$1@ID-212430.news.uni-berlin.de>, valentinNOMORESPAM@abelectron.com mentioned...
> A UART is used to transfer a byte in serial form, bit by bit. I know that 10% deviations between the frequencies of transmitter and receiver are permissible. I was taught that UARTs synchronize on the falling edge (1 to 0) of the start bit; hence, this should allow transfer of a stream of bytes of arbitrary length.
>
> I have developed a simple UART. Its receiver and transmitter run at 9600 bps with 16x oversampling. Both receiver and transmitter have a 1-byte buffer. To test the design I've created an echo device; it merely mirrors all the bytes sent to it back to the sender. It works fine with one of the COM ports on my PC. Another COM port has its crystal running at a slightly higher fundamental frequency. This causes a problem when it sends a long stream of bytes to my UART. In fact, sender and recipient cannot synchronize on the falling edge of the start bit, because one of them is slower and is still processing a previous byte while the sender proceeds to the next byte and transmits its start bit. Despite this, my receiver still works fine, because it is ready to receive the next byte right after the first half of the stop bit is received. Just to clarify, the receiver acquires values from the serial input at the middle of each data bit slice; it reports BYTE_READY in the middle of the stop bit and from this moment is ready to accept the next byte, i.e. ready for synchronization. Therefore, if data is coming slightly faster and the falling edge of the start bit falls within the stop bit (according to my UART's clock), the receiver is still able not to overlook the data.
>
> On the other hand, the transmitter should transmit all 10 bits (start + 8 data + stop) at 9600 bps. Consider for instance a UART forwarder or an echo device. If data is coming in faster than I forward it, I eventually get a buffer overrun. That is, the receiver is ready with a byte in its buffer to be copied into the transmitter for forwarding, but the slow transmitter is still shifting data out and its buffer is blocked.
>
> I have a "fast" solution for my UART echo device: if the transmitter has transmitted more than half of the stop bit and senses that there is a next byte received, it stops sending the current stop bit and starts transmitting the start bit of the next byte. Cutting the transmission short like this is not a good solution, because the transmitter may be connected to a well-matched or slightly slower running UART. The design may also not be a forwarder, so the data provider may differ from the 9600 bps receiver. In this case, starting early transmission of the next byte while the remote peer is still receiving the stop bit causes a stop-bit error.
>
> What is interesting in this situation is the fact that I can build a good echo device from any industrially manufactured UART (I've used a standalone 16C750 and the ones built into the i8051). They never have a buffer overrun issue, despite the sending port being slightly faster than the receiving one (like sending data from my fast COM port to the slow one). Note, no flow control is used, and the buffers are always 1 byte long. Which trick do they use? Again, 10% frequency deviations between sender and receiver are considered permissible, and no flow control is required since sender and receiver both run at a nominal 9600 bps.
>
> I feel this should be a well-known problem (solution) and just wonder why I did not encounter this consideration before.
>
> Thanks.
UARTs are almost always crystal controlled, so the tolerance is less than 1 percent, and there shouldn't be a problem with this. When a serial data stream is run through a multiplexer, where the sampling rate is less than 10x per bit, then you might see such large deviations.
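For reference, here is a minimal VHDL sketch of the mid-stop-bit re-arm that the original poster describes (report BYTE_READY at the middle of the stop bit and immediately go back to hunting for a start edge). The entity name, ports and 16x tick counting are illustrative assumptions, not the poster's actual code; framing-error and false-start checks are omitted.

library ieee;
use ieee.std_logic_1164.all;

entity uart_rx16x is
    port (
        clk16x     : in  std_logic;   -- 16 x 9600 Hz sample clock
        rxd        : in  std_logic;
        data       : out std_logic_vector(7 downto 0);
        byte_ready : out std_logic
    );
end uart_rx16x;

architecture rtl of uart_rx16x is
    type rx_state_t is (IDLE, START, DATA_BITS, STOP);
    signal state    : rx_state_t := IDLE;
    signal tick_cnt : integer range 0 to 15 := 0;
    signal bit_cnt  : integer range 0 to 7  := 0;
    signal shreg    : std_logic_vector(7 downto 0) := (others => '0');
begin
    data <= shreg;

    process (clk16x)
    begin
        if rising_edge(clk16x) then
            byte_ready <= '0';
            case state is
                when IDLE =>
                    if rxd = '0' then                -- falling edge of start bit
                        tick_cnt <= 0;
                        state    <= START;
                    end if;
                when START =>
                    if tick_cnt = 7 then             -- middle of start bit
                        tick_cnt <= 0;
                        bit_cnt  <= 0;
                        state    <= DATA_BITS;
                    else
                        tick_cnt <= tick_cnt + 1;
                    end if;
                when DATA_BITS =>
                    if tick_cnt = 15 then            -- middle of each data bit
                        shreg    <= rxd & shreg(7 downto 1);  -- LSB first
                        tick_cnt <= 0;
                        if bit_cnt = 7 then
                            state <= STOP;
                        else
                            bit_cnt <= bit_cnt + 1;
                        end if;
                    else
                        tick_cnt <= tick_cnt + 1;
                    end if;
                when STOP =>
                    if tick_cnt = 15 then            -- middle of stop bit:
                        byte_ready <= '1';           -- report the byte and go back
                        state      <= IDLE;          -- to IDLE immediately, so an
                    else                             -- early start bit is caught
                        tick_cnt <= tick_cnt + 1;
                    end if;
            end case;
        end if;
    end process;
end rtl;

Article: 63652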
Hello All, I am having problems running the modular design flow in Xilinx ISE 6.1. Everything seems OK until I configure (Spartan2 via Boundary Scan/Parallel using iMPACT): I get a "programming failed" message - the DONE pin failed to go high. I have tried a few different options in BitGen with no success.

Also, the Xilinx ISE 'Development Systems Reference Guide' and the Xilinx Answers database seem to conflict on the correct use of NGO/NGD files in the modular flow. i.e. Development Systems Reference Guide, Modular Design section, top of page 93 (PDF version): “Note: ngdbuild produces two files, design_name.ngd and design_name.ngo. The design_name.ngo file is used during subsequent Modular Design steps, while design_name.ngd is not.” BUT......... Xilinx Answer Record #17058, 6.1i Modular Design (http://support.xilinx.com/xlnx/xil_ans_display.jsp?iLanguageID=1&iCountryID=1&getPagePath=17058): “An NGO file is produced only if an EDIF netlist is used as the input to NGDBuild. The NGC netlist created by XST is already in Xilinx Database Format, and does not need to be converted to an NGO file.”

So could the fact that I am using the NGC netlist (produced by XST) instead of the NGO file prescribed by the Development Systems Reference Guide be causing this problem?? Please help (it's nearly Christmas!) Ian Colwill, University Of Sussex, UK
Article: 63653
It is a student's task. Commercial devices are given as an example; however, any interesting topic can be chosen.
Article: 63654
If you are looking for "coprocessing" types of connectivity, AMD Opterons with HyperTransport are one possibility. FPGAs can attach directly to an Opteron via HyperTransport; FPGAs cannot directly connect to an Intel CPU Front Side Bus (FSB). In addition, all external transactions flow through the Intel FSB, whereas the onboard Opteron memory controller services a substantial amount of the load, leaving only the I/O traffic for the HyperTransport links. Paul

walala wrote:
> Dear all,
> Is PCI the only convenient interface that talks to the CPU by conveniently inserting something into a computer? What is the speed of that? Is there any faster method?
> Thanks a lot,
> -Walala
Article: 63655
hey, i have to use a DCM as i need multiple clocks. now the problem is that they should be de-asserted (not active) before some signal, so i need some CE signal. i tried to solve it like this:

ddr_clkx2 <= ddr_clkx2_out and locked;

where locked is the lock signal from the DCM and ddr_clkx2_out is a clock output from the DCM, and it should be 0 as long as locked is 0. but when i use this trick it gives me the following warning when making the bitfile (with bitgen):

WARNING:DesignRules:372 - Netcheck: Gated clock. Clock net _n0045 is sourced by a combinatorial pin. This is not good design practice. Use the CE pin to control the loading of data into the flip-flop.

so how should i implement it, and what do they mean by the CE pin? well, i know what they mean, but how should i implement it in VHDL for a VIRTEXII? thanx in advance, kind regards, Yttrium
Article: 63656
In article <3FC541C5.D7EB206B@xilinx.com>, Peter Alfke <peter@xilinx.com> wrote:
>Agreed, and I am perhaps the strongest advocate of this point of view inside Xilinx. But Sales and Marketing are always clamoring for highest speed and lowest price. "That's what the customers want and need"...

And there is the nasty reality that SRAM burns power, that there are a LOT of config bits for each LUT, and that reprogrammable interconnect really does cost 10x the power of fixed-function interconnect, simply due to the excess capacitance on each switch point. This will always hurt FPGAs in the every-erg-counts space. -- Nicholas C. Weaver nweaver@cs.berkeley.edu
Article: 63657
>And there is the nasty reality that SRAM burns power, and there is a >LOT of config bits for each LUT, and that Reprogrammable interconnet ... Why does SRAM have to burn power? I assume you are referring to leakage since the config memory isn't flapping after configuration. Any reason somebody couldn't implement 2 types of logic on one chip. Slow but low leakage for the config memory and fast but more leakage for the active logic? I assume it doesn't work easily with modern processing or people would be doing it already. (Maybe they already are?) I'm fishing for more theory or long term ideas. [In the old days, lots of people used SRAM rather than DRAM because it used less power - no refresh.] -- The suespammers.org mail server is located in California. So are all my other mailboxes. Please do not send unsolicited bulk e-mail or unsolicited commercial e-mail to my suespammers.org address or any of my other addresses. These are my opinions, not necessarily my employer's. I hate spam.Article: 63658
"A.y" wrote: > in one line what is the general procedure to extract the best > performance(important) > as well as runtime from the tool in case of high density designs ? Hard to do this in one line! :-) Performance --------------------------------------- First, what is "performance". For some designs it is about placing the required functionality in the smallest and least expensive device. For others it is purely about speed. And, of course, there's the middle ground, where give and take is king. I'll take "performance" to mean speed, MHz, clock rate. In general terms, I think you start exploring "best performance" when you've hit a limit somewhere. For example, if you are doing a design one a VII running at 20MHz there's probably little need to waste any time doing anything beyond specifying PERIOD for good measure. As the clock rates increase and the complexity of the design grows you may hit spots where the "insert a coin and pull the lever" path simply does not work. Should you hit that wall, the first place to look is your HDL. There are ways to describe circuits that simply don't translate into an implementation that will run very fast at all. Search the NG and 'net for HDL coding style references. The prior step was about HDL coding styles that don't synthesise very well, not about choosing a particular solution or circuit, if you will. Once you get decent synthesis, if performance isn't sufficient the next question is: Is the chosen way to solve the problem the best (from a performance (speed) standpoint). For example, if you are trying to add a couple dozen values, a pipelined adder will run substantially faster than a parallel adder (at the expense of latency). For a new design the above two steps would proably be swapped as you'd want to zero-in on a good approach to solving a problem first and then make sure that the HDL implements it efficiently. For an existing design that needs fixing, you may have to take them in the sequence I presented. If you did the above and still can't achieve the performance requirement for your design you need to go in at a different level. There are simple things you can do that might make a big difference, in no particular order: - Do you have any FF's that should be in the IOB's? - Increase tool effort levels - Over constrain the PERIOD specification - Identify false paths and "TIG" them - Multi-cycle paths - Going far/wide/fast? Can you insert additional FF's in the path? - Consider device-specific resources (example: use registered multipliers) - Can you fold combinatorial logic into fast embedded rom lut's? Beyond this you have to get into RPM/Floorplanning mode. I think I can say that RPM's are the better way to go. Area constraints can be problematic and, in an evolving design, there can be a bit of a chicken-and-egg scenario. Hierarchically built RPM's done in HDL, of course, is the best approach from many standpoints. The subject of RPM's is wide and deep. In order to maximize performance you need to acquire a full understanding of the routing resources and how to use them. Just 'cause the layout looks good on the screen it doesn't mean that it will run the fastest. Many, many hours (days, months, years?) of work are required in order to fully understand this topic. <business hat on> Depending on your design's constraints you might be better advised to move up to a faster device rather than undertake layout at this level. 
Purists might cringe, but I think most will agree that sometimes it is better to spend more money on a chip and get the design out the door than to try to optimize a design to death just for the sake of superb engineering. <business hat off>

There's also the idea of using FPGA's properly. Something that I see come up frequently is wide, parallel and slow designs that consume lots of chip resources (both logic and routing). The significance of this in terms of design performance (speed) is that these highly parallel structures do eat up routing and do push and shove other modules within the chip. Routing can be complicated by this approach to a level that could compromise performance. Many of these wide/parallel/slow designs can be changed into serialized high-speed designs to take advantage of the fact that FPGA's can run so darn fast. Do that and save lots of resources for fast parallel logic. It's amazing to me how many people never take advantage of this wonderful trick, which is basically free. A FF can run at 5 MHz just as well as at 200MHz; running it slow might just be a total waste.

Tool runtime (process optimization)
---------------------------------------
This is simpler ... and not. You have five choices:

1- Simulation
2- Simulation
3- Simulation
4- Modular design
5- Incremental design

The first three look very similar. There's a message here: don't try to use synthesis and hardware to verify logical design and algorithms. That is very time consuming. Simulation is orders of magnitude faster. Go to hardware when you know that the design (or the design's components) works in simulation. A full simulation isn't required; sometimes you can use test vectors to check modules in isolation from other parts of the design.

Modular design is more appropriately used within the context of a team. Incremental design works for single-developer mode. The idea is simple: limit time-consuming processing to modules that have changed since the last run. The performance improvement can be dramatic. However --there's always one of those-- it is best to start a project in one of these modes as opposed to trying to convert an existing project. There are very specific inter-module (and other) requirements that must be taken into account. Check the manuals for further detail here.

Hope this helps. Sorry that I couldn't fit it in one line. Not that I tried.

--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Martin Euredjian
To send private email: 0_0_0_0_@pacbell.net where "0_0_0_0_" = "martineu"
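To make the pipelined-adder remark above concrete, here is one possible VHDL sketch: eight 16-bit unsigned inputs summed through a registered adder tree, one result per clock with three clocks of latency. The widths, the input count and the entity name are arbitrary examples, not anything from the original post.

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity adder_tree8 is
    port (
        clk : in  std_logic;
        a0, a1, a2, a3, a4, a5, a6, a7 : in unsigned(15 downto 0);
        sum : out unsigned(18 downto 0)    -- 16 bits + log2(8) bits of growth
    );
end adder_tree8;

architecture rtl of adder_tree8 is
    signal s0, s1, s2, s3 : unsigned(16 downto 0);  -- pipeline stage 1
    signal t0, t1         : unsigned(17 downto 0);  -- pipeline stage 2
begin
    process (clk)
    begin
        if rising_edge(clk) then
            -- stage 1: four adders in parallel
            s0 <= resize(a0, 17) + resize(a1, 17);
            s1 <= resize(a2, 17) + resize(a3, 17);
            s2 <= resize(a4, 17) + resize(a5, 17);
            s3 <= resize(a6, 17) + resize(a7, 17);
            -- stage 2: two adders
            t0 <= resize(s0, 18) + resize(s1, 18);
            t1 <= resize(s2, 18) + resize(s3, 18);
            -- stage 3: final adder
            sum <= resize(t0, 19) + resize(t1, 19);
        end if;
    end process;
end rtl;

Each stage is only one adder deep, so the clock period is set by a single addition plus routing rather than by one long combinatorial chain; the price is three clocks of latency, as the post says.

Article: 63659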
"Yttrium" wrote: > ddr_clkx2 <= ddr_clkx2_out and locked; That's BAD design, as the the tools are saying. How bad? If someone working for me did that they'd be on the street faster than the PERIOD constraint on the design. Do a newsgroup/google/yahoo search for "gated clock" for more info. > so how should i implement it and what they mean with CE pin? well i know > what they mean but how should i implement it in VHDL for a VIRTEXII? This is right out of the "Language Templates" found under the "Edit" menu: -- D Flip Flop with Clock Enable -- CLK: in STD_LOGIC; -- ENABLE: in STD_LOGIC; -- DIN: in STD_LOGIC; -- DOUT: out STD_LOGIC; process (CLK) begin if (CLK'event and CLK='1') then if (ENABLE='1') then DOUT <= DIN; end if; end if; end process; -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Martin Euredjian To send private email: 0_0_0_0_@pacbell.net where "0_0_0_0_" = "martineu"Article: 63660
In article <vsclib1e0fi69b@corp.supernews.com>, Hal Murray <hmurray@suespammers.org> wrote:
>>And there is the nasty reality that SRAM burns power, and there is a LOT of config bits for each LUT, and that reprogrammable interconnect ...
>
>Why does SRAM have to burn power? I assume you are referring to leakage since the config memory isn't flapping after configuration.

Yeup. Leakage by the bucketload...

>Any reason somebody couldn't implement 2 types of logic on one chip? Slow but low leakage for the config memory and fast but more leakage for the active logic? I assume it doesn't work easily with modern processing or people would be doing it already. (Maybe they already are?) I'm fishing for more theory or long term ideas.

The problem is that the FPGA places the SRAM cells (low leakage) right next to active (must be high speed) switching transistors. Any processing rule which had two Vts for the different transistors would probably require a fairly substantial spacing between the two types.

This would work very well for a low-power processor, with low-Vt transistors (leaky but fast) in the CPU and high-Vt transistors (low leakage but slow) in the caches. E.g., DRAM uses much higher-threshold (slower and less leaky) transistors, but mixed DRAM/logic processes required a fair separation between the DRAM blocks and other logic, and still resulted in slower logic. -- Nicholas C. Weaver nweaver@cs.berkeley.edu
Article: 63661
One possibility is to use the BUFGMUX primitive to switch between idle and active. I haven't used it for this purpose yet myself, so be *sure* to read the Libraries Guide entry on the BUFGMUX in the online SW Manuals. There may be a design issue if one of the two clocks in the BUFGMUX is inactive, so do your homework before just trying it out. "Yttrium" <Yttrium@pandora.be> wrote in message news:7Erxb.46705$Wr1.1566834@phobos.telenet-ops.be... > hey, i have to use a DCM as i need multiple clocks now the problem is that > they should be de-asserted (not active) before some signal, so i need some > CE signal. i tried to solve it like this: > > ddr_clkx2 <= ddr_clkx2_out and locked; > > were locked is the lock signal from the DCM and the ddr_clkx2_out is a clock > output from the DCM and it should be 0 as long as locked is 0. > but when i use this trick it gives me the following warning when making the > bitfile (with bitgen): > > WARNING:DesignRules:372 - Netcheck: Gated clock. Clock net _n0045 is sourced > by a combinatorial pin. This is not good design practice. Use the CE pin to > control the loading of data into the flip-flop. > > so how should i implement it and what they mean with CE pin? well i know > what they mean but how should i implement it in VHDL for a VIRTEXII? > > thanx in advance, > > kind regards, > > Yttrium > >Article: 63662
Martin Euredjian wrote: > "Yttrium" wrote: > > >>ddr_clkx2 <= ddr_clkx2_out and locked; > > > That's BAD design, as the the tools are saying. Agreed, it is not good FPGA design. But the crazy thing is that Xilinx has the ability to do this relatively safely... but they don't seem to push it very hard and the tools don't automatically use it for you. Check out the BUFGCE in the V2 and S3 devices. Not portable, but usable. MarcArticle: 63663
If you know what you are doing, sure. Like most things, there's a rule and then reasons to break it.

--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Martin Euredjian
To send private email: 0_0_0_0_@pacbell.net where "0_0_0_0_" = "martineu"

"Marc Randolph" <mrand@my-deja.com> wrote in message news:dotxb.3054$NK3.1109@newssvr24.news.prodigy.com...
> Martin Euredjian wrote:
> > "Yttrium" wrote:
> >> ddr_clkx2 <= ddr_clkx2_out and locked;
> >
> > That's BAD design, as the tools are saying.
>
> Agreed, it is not good FPGA design.
>
> But the crazy thing is that Xilinx has the ability to do this relatively safely... but they don't seem to push it very hard and the tools don't automatically use it for you. Check out the BUFGCE in the V2 and S3 devices.
>
> Not portable, but usable.
>
> Marc
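For completeness, a direct instantiation of the BUFGCE that Marc mentions might look roughly like the fragment below (Virtex-II / Spartan-3 only, and it needs the UNISIM library in front of the architecture). The signal names follow the original question; treat this as a sketch, not a verified design.

-- before the architecture:
-- library unisim;
-- use unisim.vcomponents.all;

ddr_clkx2_bufg : BUFGCE
   port map (
      I  => ddr_clkx2_out,   -- clock output from the DCM
      CE => locked,          -- clock runs only once the DCM has locked
      O  => ddr_clkx2        -- cleanly gated global clock for the DDR logic
   );

Because the gating happens inside the global buffer itself, the downstream flip-flops still see a proper global clock net rather than a LUT-sourced one, which is what the DesignRules:372 warning is complaining about.

Article: 63664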
walala wrote:
> Considering that after solving this throughput problem, the next bottleneck will be the 1GB memory that I need... I wonder if the graphics card has 1GB cache/memory inside it? Since a lot of the time it needs to do triple-buffering, I guess... it should have a high-speed, huge memory, right?

Graphics cards use AGP, now 8x; the AGP 8x interface gives 2.1 GB/sec of bandwidth. Since this is MUCH faster than the PCI bus, they handle huge amounts of data a lot better. BUT suppose someone builds an AGP 8x board with a fast FPGA. Then you suddenly have reduced the data transfer bottleneck. Suppose you use a Xilinx Pro, or add a PowerPC as an option, together with monitor-out circuits. Then you will get quite an interesting board that could be used to run the X server for Unix/Linux :-) But you can still buy PCI graphics boards... /RogerL -- Roger Larsson Skellefteå Sweden
Article: 63665
It is possible to use single-bit signals as long as you define them as a std_logic_vector(0 downto 0). This vector can be converted to a single signal by the statement signal <= vector(0); Mark

"Erik Markert" <sirius571@gmx.net> schreef in bericht news:bpv074$9aj$1@anderson.hrz.tu-chemnitz.de...
> Hello Tobias,
>
> Tobias Möglich wrote:
> > Hello,
> > Is there someone who has experience with designing a dual-port RAM?
> > I use the device Spartan-IIE (XC2S300E), but it should be similar with other devices (e.g. Virtex, Spartan 3, etc.).
> > I know there is a Synthesis Template in "Xilinx ISE Foundation".
>
> I used that template without problems. But I assume that you can't use single-bit signals as RAM data inputs.
>
> some code:
>
> multibit <= singlebit1 & singlebit2;
>
> RAM_P: PROCESS(clk)
> BEGIN
>   if rising_edge(clk) then
>     if write = '1' then
>       ram(writeaddress) <= multibit;
>     end if;
>     readaddress <= read_address_in;
>   end if;
> END PROCESS;
>
> data_out <= ram(readaddress);
>
> Synthesis now builds block RAM (synchronous read). For distributed RAM the read can be asynchronous.
>
> HTH
>
> Erik
> --
> \\Erik Markert - student of Information Technology//
> \\ at Chemnitz University of Technology //
> \\ TalkTo: erma@sirius.csn.tu-chemnitz.de //
> \\ URL: http://www.erikmarkert.de //
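Putting Erik's template and Mark's single-bit remark together, a self-contained version might look like the sketch below. The entity name, generics and widths are illustrative only; with the registered read address, XST should map it to block RAM, as Erik notes. Single bits would then be connected as wr_data <= singlebit1 & singlebit2; and singleout <= rd_data(0);

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- simple dual-port RAM: one write port, one synchronous-read port
entity dp_ram is
    generic (
        ADDR_W : natural := 8;
        DATA_W : natural := 2
    );
    port (
        clk        : in  std_logic;
        write      : in  std_logic;
        wr_addr    : in  std_logic_vector(ADDR_W-1 downto 0);
        wr_data    : in  std_logic_vector(DATA_W-1 downto 0);
        rd_addr_in : in  std_logic_vector(ADDR_W-1 downto 0);
        rd_data    : out std_logic_vector(DATA_W-1 downto 0)
    );
end dp_ram;

architecture rtl of dp_ram is
    type ram_t is array (0 to 2**ADDR_W - 1)
        of std_logic_vector(DATA_W-1 downto 0);
    signal ram     : ram_t;
    signal rd_addr : std_logic_vector(ADDR_W-1 downto 0);
begin
    process (clk)
    begin
        if rising_edge(clk) then
            if write = '1' then
                ram(to_integer(unsigned(wr_addr))) <= wr_data;
            end if;
            rd_addr <= rd_addr_in;   -- registered read address => block RAM
        end if;
    end process;

    rd_data <= ram(to_integer(unsigned(rd_addr)));
end rtl;

Article: 63666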
The launcher message is normal for the PCI core. There is no source code for this core, so ngdbuild will take the netlist instead of compiling the source code. This is correct. It seems a bit strange that the slashes in your command line are not all the same. I cannot find any reason for the abnormal program termination... Mark

"Dean Armstrong" <daa1@NOSPAMcs.waikato.ac.nz> schreef in bericht news:ee81539.-1@WebX.sUN8CHnE...
Hi All, I am trying to synthesise the Xilinx example "ping" PCI LogiCORE using Synplify Pro 7.0 and Xilinx ISE 5.2. I operate Synplify as detailed in the LogiCORE PCI Implementation Guide, and then start the ISE Project Navigator to implement the design. ISE fails at the translate stage with the output shown at the bottom of this post. I am unclear as to where the error is in this process. The second launcher message about PCI_LC_I.ngo seems a bit suspect, but does not seem to be an error. I would appreciate any help that anyone could give me on getting past this. I'm more than a little bit puzzled after playing with all the options I can find, to no avail. Regards, Dean Armstrong.

Started process "Translate".
Command Line: ngdbuild -quiet -dd e:\working\wireless\vhdl\pci\ping\synthesis/_ngo -uc E:/working/wireless/vhdl/pci/xc2s100fg456_32_33.ucf -sd E:\working\wireless\vhdl\pci\vhdl\src\xpci -p xc2s100-fg456-6 pcim_top.edf pcim_top.ngd
Launcher: "pcim_top.ngo" is up to date.
Reading NGO file "e:/working/wireless/vhdl/pci/ping/synthesis/_ngo/pcim_top.ngo" ...
Reading component libraries for design expansion...
Launcher: The source netlist for "PCI_LC_I.ngo" was not found; the current NGO file will be used and no new NGO description will be compiled. This probably means that the source netlist was moved or deleted.
abnormal program termination
ERROR: NGDBUILD failed
Reason:
Completed process "Translate".
Article: 63667
"valentin tihomirov" <valentinNOMORESPAM@abelectron.com> wrote in message news:bq5af5$1vgngs$1@ID-212430.news.uni-berlin.de... > It is a student's task. Commertial devices are given as an example; however, > any interesting topic can be chosen. > Valentin, Is it a question, a request or what? Are you looking for an article or for a topic to write an article? /MikhailArticle: 63668
"Muthu" <muthu_nano@yahoo.co.in> wrote in message news:28c66cd3.0311262012.2808420c@posting.google.com... > Hi, > > I am working with PCI-X Interface, which is 64bits wide and 133Mhz. > > But I have some problem in getting this much frequency in Xilinx FPGA? > > Regards, > Muthu > > Hi, Muthu, So you mean you cannot get such speed as claimed? -WalalaArticle: 63669
> PCI-X-3.0-1066MHz, 40 Gb Ethernet will be twice as fast as PCI-X-533MHz

Dear Kolja, I still have a small question: you said 40 Gb Ethernet is twice as fast as PCI-X-533MHz... but then 1) why have we not seen any such Ethernet inside a PC? 2) Why don't we use it in our LAN instead of the 100M Ethernet we currently use... maybe the price is very high... am I right? thanks a lot, -Walala
Article: 63670
"Kolja Sulimma" <news@sulimma.de> wrote in message news:b890a7a.0311260523.6d9f4f77@posting.google.com... > Let the cube be given by three normal vectors n1 to n3 and six points > p1 to p6 on the six planes. > (Actually you can use the same point multiple times) > Assume your rays start in the origin and are given by a vector r of > length 1. > Then the interscetions with the first plane happens at a distance d of > d1= n.p1/(n1.r)= n.p1 * (1/(n1.r)) > See http://geometryalgorithms.com/Archive/algorithm_0104/algorithm_0104B.htm#Line-Plane%20Intersection > Then you order the planes according to d. > If the ray does not cross the three front planes first, the cube is > missed, otherwise the difference between the fourth and the third > distance is the length of the intersection. > > So for each ray you get three devisions, four multiplications and a > couple of minmax cells. > (Many ore optimizations due to symmetries possible.) > > With integers you should be able to do that in a small Spartan-III in > a pipeline a lot faster than you can get data into the chip. > > With floating point numbers it should be still very fast in an FPGA, > but the design gets a lot more complicated and larger. > > Have fun, > > Kolja Sulimma Thanks a lot, Koja, Very informative,,... I need to digest your answer... -WalalaArticle: 63671
"Nicholas C. Weaver" > In article <vsclib1e0fi69b@corp.supernews.com>, > Hal Murray <hmurray@suespammers.org> wrote: <snip> > >Any reason somebody couldn't implement 2 types of logic on one > >chip. Slow but low leakage for the config memory and fast but > >more leakage for the active logic? I assume it doesn't work > >easily with modern processing or people would be doing it already. > >(Maybe they already are?) I believe foundries offer this already. Maybe Austin L could confirm if Xilinx are using this ? In noise immunity topics, we've seen the point made that the CONFIG cells have significantly different, and better, noise immunity ( so config corruption is less likely than logic corruption). > > I'm fishing for more theory or long term ideas. > > The problem is the FPGA places the SRAM cells (low leakage) right next > to active (must be high speed) switching transistors. > > Any processing rule which had two Vts for the different transistors > would probably require a fairly substantial spacing between the two > types. Why ? Sure, more steps will be needed - but spacing ? There would be a case for really pushing the Speed rules for LOGIC, but going for MAX Yield (ie slightly relaxed geometries) on the CONFIG cells (lots more of them, _and_ they are not 'picosecond paranoid' ). There would be some trade off on CONFIG time, and leakage Current in a variable threshold design. I think this could be checked experimentally - drop Vcc, and LoadCLK, and plot Config verify fail Vcc/Freq curve. Then create a LOGIC fabric shifter, and do the same for it. -jgArticle: 63672
An article. I'm looking for an article with an interesting topic.Article: 63673
Hi, I have a question about the Xilinx Timing Analyzer (trce). Does it report the I/O timing at the die pad or the package pin / ball? I believe the 5.x and earlier versions of the software referred timing to the die pads, whereas 6.x seems to be taking the flight delay of the package into account. Could someone from Xilinx please confirm whether this is the case? Thanks, AllanArticle: 63674
hello martin, accept thanks from my heart for devoting your precious time to write all this.

> Hard to do this in one line! :-)

pl forgive my english, it was 'my' one line of question :( I do understand it was a very vague question .. but let me clear up my intentions .. I can't say how good my architectural decisions, hdl coding style, synthesis strategy .. and control over the xilinx tools are .. but i have spent a good amount of time on this design .. things boil down to my eagerness to do the "superb engineering" .. throw off the business hat for the time being .. maybe i can optimize and reduce the usage for my device ... but u can assume that i won't stop adding something more into it !

> I'll take "performance" to mean speed, MHz, clock rate.

u r quite correct ..

> - Do you have any FF's that should be in the IOB's?
> - Increase tool effort levels
> - Over-constrain the PERIOD specification
> - Identify false paths and "TIG" them
> - Multi-cycle paths
> - Going far/wide/fast? Can you insert additional FF's in the path?
> - Consider device-specific resources (example: use registered multipliers)
> - Can you fold combinatorial logic into fast embedded ROM LUT's?

pretty good things, possibly known to me :), and i am searching whether something more is possible.. i do have some spare mults and brams !

> Beyond this you have to get into RPM/floorplanning mode. I think I can say that RPM's are the better way to go. Area constraints can be problematic and, in an evolving design, there can be a bit of a chicken-and-egg scenario. Hierarchically built RPM's done in HDL, of course, are the best approach from many standpoints. The subject of RPM's is wide and deep. In order to maximize performance you need to acquire a full understanding of the routing resources and how to use them. Just 'cause the layout looks good on the screen it doesn't mean that it will run the fastest. Many, many hours (days, months, years?) of work are required in order to fully understand this topic.

yes, this_is_what my engineering senses want to explore more ... i think i am a newbie in the rpm world .. would you suggest some good reading on hierarchical rpms ? and not only for now, this will help me in the future also..

> There's also the idea of using FPGA's properly. Something that I see

yes, some tricks i have learnt/been learning, like using srls, cy logic, mults, luts, brams in xilinx in a better way.. thanks to techexclusive and this forum ..

> Tool runtime (process optimization)
> ---------------------------------------

these are all excellent concepts, but you pull your hair out when you do a very small thing and the tool shouts errors and warnings .. then you open the browser, search for the workaround .. if found, understand it .. and try it to see if it works .. everything fine then !? u basically shift ur figures of cpu usage from ngdbuild/map/par to netscape/opera/iexplorer :) (just kidding) but equally true .. we do learn a lot from mistakes :)

thank you very much for all your inputs. regards ay