Hi Kumaran,

My replies below.

Vaughn

kumaran@trlabs.ca (Kumaran) wrote in message news:<40f2d3e9.0311200917.48805134@posting.google.com>...
> Hi Vaughn,
> Thanks for the information. I tried DSE, and I am getting an error
> saying "Error: DSE does not support the ACEX1K family!".

Apparently it didn't in QII 3.0. Sorry. I tried it in 4.0 and it worked, so I assumed it did in 3.0 as well, but apparently it was only recently ported to the ACEX 1K family. QII 4.0 will be out in 2-3 months. Until then, you can try manually changing the fitter seed as a poor man's DSE. It's unlikely to help you much, though, if the problem is poor synthesis leaving you very far from timing.

> > 1. Turn off "optimize i/o cell register placement for timing". That
> > option enables some aggressive optimizations for registered IO
> > timing. These optimizations can hurt internal clock frequencies
> > (fmax). If you are meeting your IO (Tsu & Tco) timing, but missing
> > a clock timing constraint, turning this option off may help.
> Yes, I am meeting the IO (Tsu & Tco) timing. I turned off the "optimize
> i/o cell register placement for timing", but it did not make any
> difference.

Ok. Either you don't have registered IOs (in which case that algorithm does nothing) or it's working well and not hurting your Fmax.

> > 4. It sounds like you are using Quartus for synthesis. Make sure
> > you have "default logic synthesis style" = speed selected (this is
> > the default). It's under Settings->Default Logic Options
> Did you mean optimization technique? The optimization technique is set
> for speed.

Yes, it is called Optimization Technique in QII 3.0. Sorry, the version on my computers is 4.0 (in development), and we renamed the option there.

> > 5. Make sure you've set all your timing constraints.
> I have set all my timing constraints.
>
> I could synthesize my code for Acex 1K100 using Leonardo without any
> timing issues. I used Max+2 to compile the EDIF file generated by
> Leonardo. By using Quartus, is there any way that I could get the same
> speed that I got from LS and Max+2? Thanks for your time.

Since you're using a different synthesis tool and a newer place and route tool, there's no way to guarantee the same timing. Mostly QII beats MaxPlus2 handily for 10K designs, but there are always a few designs it does worse on. If your speed is considerably worse, I suspect you are seeing a synthesis quality problem, since you switched synthesis tools.

If you look at the critical path from the timing analyzer output, and double-click on each hop in the critical path, you can see whether:

1. There are a lot of hops on the critical path (more than there were in MaxPlusII). That probably indicates poor synthesis. You may be able to re-code your HDL to get better synthesis. Or you could synthesize with Leonardo and bring that into Quartus, since it sounds like you got good results with Leonardo.

2. There aren't that many hops (logic levels) on the critical path, but they use a lot of interconnect on each hop. This indicates poor place & route, and changing the fitter seed may help.

If you generate a .qar of your design and send it in, I'll file a software problem report on it, and hopefully we can find some optimization flaw. Sending in what timing you achieved in MaxPlus2 would also be helpful. But without seeing the design I'm out of ideas.

Regards,
Vaughn
Altera

Article: 63551
Vaughn,

Thank you very much for your help! I was confused when reading apex.pdf: in some places it says 8 "dedicated clock and input pins", in others 4 "dedicated clocks". Now I know the difference is the 4 "fast input pins".

One question: to access the "dedicated fast resources", do I simply define an internal net as a "global signal"? During compilation, I saw messages like "promote signal XXX to global signal automatically". Does that mean it already uses dedicated fast resources for that signal?

> The two resources aren't much different. The dedicated clocks are
> driven by dedicated input pins, while the fast dedicated networks can
> be driven by bidirectional IOs or internal signals from the FPGA
> fabric. So most people just consider this 8 dedicated clocking /
> asynchronous clear networks.

I just did an experiment: I used pin Y34 (a dedicated clock pin) to drive a few small modules and saw clock skew of less than 0.1 ns. Then I used pin B19 (fast1) to drive the same modules, and this time I saw clock skew of more than 1.1 ns (skew observed from the layout/floorplan view). Do you think this skew will be too large for the hold time of the flip-flops on the FPGA?

> clk1p and clk2p aren't connected together, so you can send 4 signals
> in through the dedicated clock pins.

I wasn't clear in my first post: the FPGA is sitting on a DSP board, so clk1p and clk2p are connected. I will probably cut them apart.

> The FAST pins drive the dedicated FAST networks, which can be used as
> another 4 clock networks:

Same concern as I mentioned above: is >1.1 ns skew too large for them to be used as clock networks?

thanks!
Yi

Article: 63552
Philip,

after thinking about the problem once more, I hate to admit that, yes, you are right (as usual).

I still do not believe, though, that inserting idle time one way or the other (including cutting the transmitter's stop bit) is a solution. Consider the following:

Left side: Slow (9600 Baud)
Right side: Fast (9700 Baud)

Both sides use e.g. 8N2 for Tx and 8N1 for Rx.

At some point in time, Left sees its buffer filling up and hence skips a few stop bits here and there (using 8N1) in order to compensate for this. Left is now faster than Right, despite the clock rates.

As a consequence, Right sees its buffer filling up and skips stop bits (using 8N1) as well.

This continues until both sides transmit with 8N1 all the time; at this point Left will lose data.

Thus, there must be some kind of understanding between Left and Right about which of the two is the "clock master" that ultimately controls the transmission speed. Unfortunately this is sometimes not possible, for instance in symmetric configurations.

/// Juergen

Article: 63553
Philip Freidin wrote:
> The following solutions can be made to work:

<A to F snipped>

G) Copy the scheme used in the TI MSP430, effectively a DDS.

Using this arrangement gets you a lot more speed, and you can take the TI documentation as a spec and save a ton of time...

If you do it, please forward the results to Xilinx for a writeup as an App Note ;-)

Article: 63554
Hi All,

Thanks for all the suggestions and comments. I guess the next thing is to simply buy one and try it out.

Regards,
Hans.
www.ht-lab.com

"louis lin" <louis@zyflex.com.tw> wrote in message news:bpsik2$fpn@netnews.hinet.net...
> There's a tip on ISE5 of Xilinx:
> Please set the environment variable XIL_IMPACT_LPT2_BASE_ADDRESS to the
> base address used by your USB converter.
> It's worked on a PCI printer card in ISE5 / Windows 2000.
> However, I don't know if it will work on USB converter.
>
> "Amontec Team, Laurent Gauch" <laurent.gauch@amontecDELETEALLCAPS.com> :3FC1B602.7000303@amontecDELETEALLCAPS.com...
> : Hans wrote:
> : > Hi All,
> : >
> : > I have recently received a new all singing all dancing (well nearly :-)
> : > laptop but unfortunately it no longer has a serial or parallel port (Dell
> : > 5150). In order to use my serial and parallel download/program cables I need
> : > one of those USB to serial/parallel converters.
> : >
> : > Do they work (i.e. simulate a real parallel/serial port) or am I asking for
> : > trouble?
> : >
> : > What about a PCMCIA parallel/serial card?
> : >
> : > Thanks
> : >
> : > Hans
> : >
> : > www.ht-lab.com
> :
> : Hi,
> :
> : We tried months ago, but with only troubles. They do not come from
> : hardware, but from software ... all is depending on how the drivers are
> : written.
> :
> : Amontec announced a new USB pod for Xilinx and Altera JTAG access. The
> : POD will be ready for Q1 2004.
> :
> : Laurent
> : www.amontec.com

Article: 63555
The SIIE is a .15u process shrink.

Austin

Jim Granville wrote:
> "Symon" wrote
>
>>"Jim Granville" wrote
>>
>>>Following the recurring thread of 5V IO,
>>>the loss thereof, and 'the price of progress', here are some
>>>of the newest numbers from the uC industry :
>>>
>>>          Philips LPC2129   Spartan IIE
>>>General   256KF/ARM         Advanced FPGA
>>>
>>>Vcore     1.8V              1.8V
>>>Vio       <5.5V             <3.6V
>>>Icctyp    10uA              10mA
>>>IccMAX    <500uA            <200mA
>>>
>>>Icc numbers are Static, ie represent standby power levels.
>>>FPGA of similar core Vcc is chosen, and smallest IIe device is chosen
>>>to avoid too much die-area skew effect.
>>>-jg
>>
>>Hi Jim,
>> I haven't used this Philips part but, just so I know you're not
>>comparing apple and oranges, this Philips part supports upteen different
>>I/O standards? From 3.3V LVTTL to 2.5V LVDS at 622M?
>> cheers, Syms
>
> Sorry if this was not clear - this is from the uC industry, so it is
> comparing 'like process' capability (~ 0.18u) - what the silicon DOES
> is, of course, quite different.
> However, the fundamentals like IO and leakage are a bit more portable
> and it's good to compare real numbers when a vendor claims some
> 'spec erosion' is the 'price of progress' - implication: 'We couldn't
> do anything about that'.
>
> -jg

Article: 63556
hi,

I have some elementary doubts regarding floorplanning, and am hoping for some quick guidance here (not homework).

In the Xilinx Floorplanner or manually (using ISE 5.1):

1. Can I assign flexible area constraints to individual modules in a design? I mean, can I say a module should fit in this much area of some shape, without specifying absolute slice locations in the area group?
2. Can structures other than rectangles be specified?

I guess I am missing something obvious :( ?

thanks a lot,
ay

Article: 63557
Allan,

4.05V is the abs max. You will feel much better being within the recommended operating conditions.

The 4.05V is really not a change. The old overshoot and undershoot amounted to the same number (3.75V Vcco + a 0.3V undershoot = 4.05V relative to the Vcco pin).

Austin

Allan Herriman wrote:
> On Sat, 22 Nov 2003 18:43:41 +1100, Allan Herriman
> <allan.herriman.hates.spam@ctam.com.au.invalid> wrote:
>
>>On Fri, 21 Nov 2003 14:25:04 -0800, "Symon" <symon_brewer@hotmail.com>
>>wrote:
>>
>>>Hi Peter,
>>> OK, but the signals come from another board, they're ringy (is that a
>>>word?) and I'm concerned about over/undershoot, I'd prefer to give myself
>>>the safety margin of 3.3V VCCO.
>>
>>I believe you have *less* safety margin with the 3.3V VCCO.
>>The abs max voltages on the pin are
>>
>>Gnd + 3.6V to VCCO - 3.6V.
>>
>>With a 3.3V VCCO, you can exceed the abs max voltage (with your
>>"ringy" signals) before the catch diode conducts.
>>
>>With a 2.5V VCCO, the diodes will stop you from exceeding the voltage
>>rating, but you may exceed the current rating when driving from a 3.3V
>>device with stiff outputs.
>>A small value series resistor fixes that problem. (You probably need
>>a series resistor for signal integrity reasons anyway.)
>>
>>On my current board I use all 2.5V signalling on the FPGA (not
>>including the LVDS stuff).
>>There was a legacy 3.3V level processor interface, and I used a number
>>of 74ALVC164245 to handle the level translation.
>
> Did I say that? The latest version of the data sheet has changed the
> abs max voltage to 4.05V (up from 3.6V). I wish I could have found
> that out *before* I added all those 74ALVC164245s to the board.
>
> Regards,
> Allan.

Article: 63558
I've got a Cyclone design: Video (SAA7113) to LCD converter.

"Wang Feng" <fwang11@pub3.fz.fj.cn> wrote in message news:bpn4lh$3oa$1@news.yaako.com...
> Are there any reference designs for video frame memory control logic
> to work with Philips SAA7111 decoder?
>
> email to fwang11@pub3.fz.fj.cn
>
> Thanks,
>
> Wang, Feng

Article: 63559
I have a process which has 2 loop statements. Only one of them, the first, seems to execute all the time; the second never gets a chance. Why is that?

Article: 63560
Hal Murray <hmurray@suespammers.org> wrote:
>> You bring up a good subject, and you're absolutely correct that if you
>> continuously send data from one serial port at 9600.01bps to a receiver
>> at 9600, sooner or later there must be a buffer overflow. ...
>
> I think you are missing a key idea. The receiver has to make
> sure that it will tolerate early start bits. That is the receiver
> has to start looking for a new start-of-start-bit right after
> it has checked the middle of the stop bit rather than wait until
> the end of the stop bit to start looking.

Unless your (slightly slower) transmitter also has the capability of producing shortened start (or stop) bits, how does this approach 'fix' the problem? If the data rates are, say, 9601 received BPS and 9600 transmitted BPS, detecting early start bits just buys you one extra bit interval before you overrun your buffers, doesn't it?

---Joel

Article: 63561
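To put rough numbers on the overrun argument above: with continuous traffic and no idle time, a relay's backlog grows at the difference between the incoming and outgoing bit rates, so a larger buffer only postpones the overflow. The following is a minimal sketch, assuming 10-bit frames (8N1) and hypothetical buffer sizes; the function name is illustrative and does not come from any poster's design.

```python
import math


def seconds_until_overrun(rx_bps, tx_bps, buffer_chars, bits_per_frame=10):
    """Seconds until a store-and-forward buffer overflows at 100% line load.

    rx_bps is the rate at which bits arrive, tx_bps the rate at which the
    relay can send them on. Both are assumed constant.
    """
    if rx_bps <= tx_bps:
        return math.inf  # the sender is not faster, so the backlog never grows
    # The backlog grows by (rx_bps - tx_bps) bits per second, and the buffer
    # holds buffer_chars * bits_per_frame bits in total.
    return buffer_chars * bits_per_frame / (rx_bps - tx_bps)


if __name__ == "__main__":
    for size in (1, 16, 1024):
        t = seconds_until_overrun(9601, 9600, size)
        print(f"{size:5d}-char buffer overflows after about {t:.0f} s")
```

Under these assumptions, tolerating an early start bit changes where each character is sampled but not the rate difference, which seems to be the point of Joel's question.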
"A.y" wrote: > 1. Can i assign flexible area constraints to individual modules in a design.. Depends on your definitions of "flexible", but, yes, you can identify all the logic for a given module and assign an area constraint to it. > i mean, can i say a module shud fit in this much area of some shape .. You can't say that a module "should fit" a given area. What you say is something like "don't put logic for this module outside of this area". The area has to be large enough for the grouped logic. Easiest way to find out is to use the Floorplanner to gather the logic you want to constrain. While creating the area constraint graphically, it will tell you what percentage of the selected logic fits within the rectangle you are defining. > without specifying absolute slice locations in area group ? .. and Well, you do define an area within the chip. Not sure what you mean here. If you want something that will maintain a given layout as you move it about a chip you have to use RPMs. Even then, crossing certain boundaries gets complicated (embedded multiplier columns). > 2. can structures other than rectangle be specified ? Sure, you can specify multiple rectangles to form other shapes. -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Martin Euredjian To send private email: 0_0_0_0_@pacbell.net where "0_0_0_0_" = "martineu"Article: 63562
A.y wrote:

>hi,
> I have some elementary doubts regarding floorplanning, and am
>hoping for some quick guidance here (not homework).
>In the Xilinx Floorplanner or manually (using ISE 5.1):
>1. Can I assign flexible area constraints to individual modules in a design?
>I mean, can I say a module should fit in this much area of some shape,
>without specifying absolute slice locations in the area group?

You can specify that the module(s) should be grouped together, but you can't specify the size and shape unless you give it a location.

>2. Can structures other than rectangles be specified?

You can put multiple rectangles together to form T, L, U, etc. shapes. PACE is the easiest way to create these area groups.

Steve

>I guess I am missing something obvious :( ?
>thanks a lot,
>ay

Article: 63563
GPG <peg@slingshot.co.nz> wrote:
> MAXIMUM error is .5 bit over one frame. In your case frame = 10 bits.
> .5/10 = 5%

I remember the story at a former employee who was using an HP serial test set at 38400 bps to debug a data stream that went to a slot machine controller. He kept having inexplicable data corruption errors until he looked at the timing very closely on a 'scope and discovered that -- due to an 'off by one' error in some register of the UART in the microcontroller generating the data stream -- the actual bit rate was something like 10% off. It had gone undiscovered for years since the microcontroller at the receiving end was using the same code!

---Joel Kolstad

Article: 63564
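The rule of thumb quoted above (half a bit of error over one frame) can be written out for different frame formats. This is a minimal sketch of that idealized bound only; the function names are illustrative, and in practice the usable margin is usually smaller because both ends contribute error and the receiver quantizes its sampling point.

```python
def frame_bits(data_bits, parity_bits, stop_bits):
    """Bits per frame, including the single start bit."""
    return 1 + data_bits + parity_bits + stop_bits


def max_clock_mismatch(bits_per_frame, margin_bits=0.5):
    """Idealized total clock mismatch a receiver can absorb over one frame."""
    return margin_bits / bits_per_frame


if __name__ == "__main__":
    formats = {"8N1": (8, 0, 1), "8E1": (8, 1, 1), "8N2": (8, 0, 2)}
    for name, fmt in formats.items():
        bits = frame_bits(*fmt)
        print(f"{name}: {bits} bits per frame, limit about {max_clock_mismatch(bits):.1%}")
    # The anecdote's transmitter was roughly 10% off, which is outside
    # the idealized 8N1 limit computed here.
    print(0.10 > max_clock_mismatch(frame_bits(8, 0, 1)))
```

A transmitter and receiver that share the same wrong rate, as in the anecdote, have zero mismatch relative to each other, which would explain why the error went unnoticed until standard test equipment was attached.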
In article <CXvwb.9478$ws.845858@news02.tsnz.net>, Jim Granville <no.spam@designtools.co.nz> wrote:
> Following the recurring thread of 5V IO,
>the loss thereof, and 'the price of progress', here are some
>of the newest numbers from the uC industry :
>
>          Philips LPC2129   Spartan IIE
>General   256KF/ARM         Advanced FPGA
>
>Vcore     1.8V              1.8V
>Vio       <5.5V             <3.6V
>Icctyp    10uA              10mA
>IccMAX    <500uA            <200mA
>
>Icc numbers are Static, ie represent standby power levels.
>FPGA of similar core Vcc is chosen, and smallest IIe device is chosen
>to avoid too much die-area skew effect.

A: Even a small FPGA has so many config bits and active gates that static leakage becomes vastly significant, while the Philips part has only 16KB of SRAM (most of the storage is flash).

B: There is a tradeoff between static leakage and performance. A 4-stage CPU pipeline with a max frequency of 60 MHz is incredibly biased towards low power & very low leakage, not high performance. Heck, the LEON sparc core in the .25 micron FPGAs will run at ~30+ MHz, and that's a fully synthesized, no optimization at all design!

C: Biasing towards lower leakage also allows higher Vio, as now you have thicker oxide layers.

--
Nicholas C. Weaver nweaver@cs.berkeley.edu

Article: 63565
Try contacting the Xilinx hotline; they're usually willing to help look at designs and can save you some time by at least identifying the cause of the no-fit.

Article: 63566
Hi,

I'm looking for VHDL example code to implement a DDR memory in an Altera 'Stratix' (not a controller), with use of RAS, CAS, etc.

There are many examples of memory in the MegaWizard of Quartus (DP-RAM, FIFO), but I can't find DDR. I would like to have this DDR included in a specific memory block (M-RAM or M512 or others).

Thanks for your help,
bhb

Article: 63567
Xilkernel comes with a number of examples to illustrate its usage. These examples can be found both in the install area for EDK, and in your local project directory. Please see the examples in the test/arch/microblaze area to see how this is done. For thread creation, see print_thread.c.

Hope this helps.

-- Mohan

Frank wrote:
> Can anyone explain to me how to use the xilkernel? I added the
> following to the mss file:
>
>   BEGIN LIBRARY
>    parameter library_name = xilkernel
>    parameter library_ver = 1.00.a
>    parameter max_procs = 10
>    parameter max_readyq = 10
>    parameter sched_type = 2
>    parameter config_sema = true
>    parameter config_msgq = true
>    parameter config_thread_support = true
>    parameter config_shm = true
>    parameter config_malloc = true
>    parameter config_mutex = true
>    parameter process_table = ( (0xffe00000, 28))
>    parameter msgq_table = ( (4, 1) )
>    parameter mem_table = ((4,10), (8,10))
>    parameter shm_table = ((100))
>   END
>
> but when I try to use thread_create or sys_thread_create, I get an
> undefined reference to it when linking.
>
> Thanks,
> Frank

Article: 63568
Frank wrote:
> Hello,
>
> I have built a bootloader which is located in block ram. Now I want to
> download my final application to sdram and execute it. If I'm correct,
> I have to make a linker script in order to make this possible.

Not necessarily. You can simply specify a different start address for your boot loader and your application on the mb-gcc command line. See the Makefile_mb.sh and other Makefile*.* files related to MicroBlaze in the Xilkernel install area.

> Besides this, I want to use the xilkernel in my application, but not in
> the bootloader, is this possible?

Yes.

> I guess I have to convert the application.elf file to a binary file in
> order to be placed into sdram by the bootloader?!

Depends on your bootloader. If your bootloader expects binary then you have to. I have seen bootloaders that use other formats as well, such as SREC or some application specific encoding.

> How are the interrupts handled? Is the interrupt handler from the
> bootloader used (because it default jumps to address 0x18)? Can I
> install a new interrupt handler which is located in my application?!
> A lot of questions. I searched the forums, but there are not many
> examples available. I'm sure there are people who already did this
> before.

There are some examples of Xilkernel usage in the install area itself (search for print_thread.c).

> Please help, thanks

Article: 63569
<Sujatha> wrote in message news:ee814a9.-1@WebX.sUN8CHnE...
> I have a process which has 2 loop statements.. only one seems to
> execute all the time.. the first one.. the second never gets a
> chance.. why is that..

Hey Sujatha,

Could be lots of things. Post a snippet of your code, maybe we can help. Or try comp.lang.vhdl?

cheers, Syms.

Article: 63570
juergen Sauermann wrote:

(snip)

> I still do not believe, though, that inserting idle time one way or the
> other (including cutting the transmitter's stop bit) is a solution.
> Consider the following:
>
> Left side: Slow (9600 Baud)
> Right side: Fast (9700 Baud)
>
> Both sides use e.g. 8N2 for Tx and 8N1 for Rx.
>
> At some point in time, Left sees its buffer filling up and hence skips
> a few stop bits here and there (using 8N1) in order to compensate for this.
> Left is now faster than Right, despite the clock rates.
>
> As a consequence, Right sees its buffer filling up and skips stop bits
> (using 8N1) as well.
>
> This continues until both sides transmit with 8N1 all the time; at this
> point Left will lose data.

As far as I know, asynchronous transmission was intended to be between two devices, such as a terminal and a computer, though more likely two terminals in the early days. The two stop bits were required by machines that mechanically decoded the bits (the Teletype (R) ASR33, for example). Using stop bits as flow control seems unusual to me.

Electronic UARTs (no comment on mechanical ones) sample the bit at the center of each bit time. For a character with no transitions (X'00' or X'FF') timing error can accumulate for the duration of the character. The STOP bit is the receiver's chance to adjust the timing, and start over with the new START bit. With a 5% timing error, which is very large for a crystal controlled clock, the stop bit could start 0.45 bit times early, but the receiver will still detect it at the right time, and be ready to start the next character.

The timing for each character is from the leading edge of the START bit. This allows for a difference in the bit clock rate between the transmitter and receiver. It is unrelated to any buffering or buffer overflow problems that may occur.

-- glen

Article: 63571
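The 0.45 bit figure in glen's post follows from letting the error accumulate over the start bit plus eight data bits: 9 x 5% = 0.45. A minimal sketch of that arithmetic and of mid-bit sampling is shown below. It is an idealized model (exact start-edge detection, no x16 quantization, no noise), and the function names are illustrative rather than taken from any UART.

```python
def stop_bit_edge_offset(clock_error, data_bits=8):
    """Offset, in bit times, of the stop bit's leading edge at the receiver.

    Timing restarts at the leading edge of the START bit, so the error
    accumulates over the start bit plus the data bits.
    """
    return clock_error * (1 + data_bits)


def first_missampled_bit(period_ratio, frame_bits=10):
    """Index of the first bit sampled outside its own bit cell, or None.

    period_ratio is the receiver bit period divided by the transmitter
    bit period. Bit 0 is the start bit and the last index is the stop bit.
    The receiver samples bit k at (k + 0.5) of its own bit periods after
    the start edge.
    """
    for k in range(frame_bits):
        sample_time = (k + 0.5) * period_ratio  # in transmitter bit times
        if not (k <= sample_time < k + 1):
            return k
    return None


if __name__ == "__main__":
    print(f"edge offset at 5% error: {stop_bit_edge_offset(0.05):.2f} bit times")
    for ratio in (0.95, 1.05, 1.10):
        print(f"period ratio {ratio}: first bad bit = {first_missampled_bit(ratio)}")
```

In this model a 5% mismatch in either direction still samples every bit of a 10-bit frame inside its own cell, while a 10% mismatch goes wrong partway through the data bits.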
Hi,

does anybody know if there are some working examples using this interface to Xilinx FPGAs for hardware debugging?

kind regards,
thomas

Article: 63572
"Philip Freidin" wrote > <snip> > And Philip writes: > Modifying the local transmited character to be a non-standard length > by changing the length of the stop bit on the fly as buffer > over-run is about to occur is not a good idea, as you don't > know the details of how the receiver that is listening to it was > designed, and it may not be very happy to see the next start bit > before it is finished with what it expects is a full length stop > bit, but it is not. UARTs look for the START edge, from the _middle_ of the STOP bit. With x16 clocking, typically that gives 8 possible time slots for earlier start. I would agree that a half-bit jump in STOP, as the OP first suggested, is NOT a good idea, but fractional (1/16 quantized ) STOP changes are valid and safe. <snip> > > BUT this is not a solution to the original poster's problem! > The problem still exists because the remaining .5 bit is still > going to arrive, the data is being sent with a slightly faster > clock than the transmitter is able to retransmit the character. > If there is no line idle time between the end of the inbound > stop bit and the next inbound start bit, the system will > eventually have an over-run problem, no matter how big the > input buffer. The closer the two clock rates, and the bigger > the buffer, the longer it takes to happen, but it will happen. Yes, true if the stop bit is 'whole bit' quantized. CAN be avoided if the TX can move the START edge as needed, both left and right, in 1/16 steps. Something like +/-4 sixteenths would leave design margin. > F) > Be sneaky. Most UARTs can be set for 1 , 1.5 , or 2 stop bits. > Set the far end transmitter for 8N2 (1 start, 8 data, 2 stop). > Set your receiver and transmitter for 8N1 (1 start, 8 data, 1 stop). > This works, because stop bits look just like line-idle. This > effectively implements (B), but is localized to the initialization > code for the far end transmitter. Yes, by far the simplest, and most practical solution. 
However, this is comp.arch.fpga, and here we can design any UART we like, including one that can handle 100% traffic loading, single stop bits, and 1-2% region clock skews. ! :) To illustrate this, look at the SC28C94 UART data, this from info on their 16 possible STOP BIT options: MR2[3..0] = Stop Bit Length 0 = 0.563 1 = 0.625 2 = 0.688 3 = 0.750 4 = 0.813 5 = 0.875 6 = 0.938 7 = 1.000 8 = 1.563 9 = 1.625 A = 1.688 B = 1.750 C = 1.813 C = 1.875 E = 1.938 F = 2.000 -jgArticle: 63573
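The stop bit lengths listed above are all multiples of 1/16 of a bit, which fits the x16 clocking mentioned earlier in the post. The sketch below reproduces the listed values from the 4-bit code. The mapping is inferred only from the numbers quoted in the post, not taken from the SC28C94 data sheet, and the function name is illustrative.

```python
def stop_bit_length(code):
    """Stop bit length, in bit times, for a 4-bit MR2[3..0] style code.

    Inferred from the quoted list: codes 0..7 span 9/16 to 16/16 of a bit,
    and codes 8..15 span 25/16 to 32/16 of a bit.
    """
    if not 0 <= code <= 15:
        raise ValueError("code must be in the range 0..15")
    sixteenths = 9 + code if code < 8 else 17 + code
    return sixteenths / 16


if __name__ == "__main__":
    for code in range(16):
        print(f"{code:X} = {stop_bit_length(code):.4f}")
```

The printed values are exact sixteenths (for example 0.5625), which the quoted list shows rounded to three decimals.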
"Nicholas C. Weaver" <nweaver@ribbit.CS.Berkeley.EDU> wrote in message news:bq01k6$29db$1@agate.berkeley.edu... > In article <CXvwb.9478$ws.845858@news02.tsnz.net>, > Jim Granville <no.spam@designtools.co.nz> wrote: > > Following the recurring thread of 5V IO, > >the loss thereof, and 'the price of progress', here are some > >of the newest numbers from the uC industry : > > > > Philips LPC2129 Spartan IIE > >General 256KF/ARM Advanced FPGA > > > >Vcore 1.8V 1.8V > >Vio <5.5V <3.6V > >Icctyp 10uA 10mA > >IccMAX <500uA <200mA > > > >Icc numbers are Static, ie represent standby power levels. > >FPGA of similar core Vcc is chosen, and smallest IIe device is chosen > >to avoid too much die-area skew effect. > > A: Even a small FPGA has so many config bits and active gates that > static leakage becomes vastly significant, while the Phillips part has > only 16KB of SRAM (most of the storage is flash). Not sure I follow this. Are you saying only SRAM determines leakage ? I would have expected the total CMOS P-N pairs to determine leakage, and that should be largely die area proportional (with possible factors of Vcc disable of whole blocks, if that is done ) > B: There is a tradeoff between static leakage and performance. A > 4-stage CPU pipeline with a max frequency of 60 MHz is incredibly > biased towards low power & very low leakage, not high performance. > Heck, the LEON sparc core in the .25 micron FPGAs will run at ~30+ > MHz, and thats a fully synthesized, no optimization at all design! I think the 60MHz is dictated primarily by FLASH access, and that speed, inclusive of flash, is at the 'high performance' end for uC. > C: Biasing towards lower leakage also allows higher Vio, as now you > have thicker oxide layers. To say it is a trade-off is correct. It it becomming is more common to see variable oxide process offered - it could well be being done now, in FPGAs to give 3.6IO on 1V cores. 
The point I am illustrating is that FPGAs have made impressive strides in Speed, and features/dollar over the last 5 years, but that has come at some other performance cost and compromise. If they really want FPGAs to replace ASICs, or FPGA cores to expand markets, that will be the next focus. Intel is a putting a LOT of R&D into leakage control, as they realise it is restricting their expansion and deployment. Seems a natural 'next step' for (eg) Xilinx to take their 90nm Spartan-3 devices, and tune for leakage first, and speed second. Same tools, very largely the same mask sets, but new customers - those who would look at 200mA max, 10mA typ and say 'pity, could have used that..' -jgArticle: 63574
"Andras Tantos" <andras_tantos@tantos.yahoo.com> wrote in message news:<3fbe66ee$1@news.microsoft.com>... > Hi! > > > I guess this is a ray-tracing problem... But I need to do this task in as > > high as possible speed/throughput. Here is my problem: > > > > Suppose I am given 25 rays and I am given a 3D cube and all parameters of > > these rays and cube are given... > > > > I need to compute the length of the intersecting segment of the rays with > > this cube as fast as possible. If some rays completely fall outside of the > > cube, then it outputs 0, otherwise gives the length. > > - Are the vertices of the cube parallel to the coordinate axles? > - Is the anything special about the cube (size, orientation, rotation, > location, etc.)? > - In what format are the rays and the cube defined? > > > I heard there are some very good graphic card with accelerator... and I > > heard about the bus bandwidth to be as high as 500MHz... I am not sure if > > they have good accelaration function for doing my task? > > Possibly no. First, you would have to get data *back* from the accelerator > which is something they are not designed for. As someone said to me once, > they operate in a 'write and forget' mode. Second, none of the accelerators > I know of do ray-tracing. However your question is not a complete ray-trace > problem, so you might be able to tweak the functions of an accelerator to > give you your answer. > > > I also think of doing this using an FPGA which is hooked onto a Intel PC > > with Linux... I don't know the details, but I guess it uses PCI or other > bus > > to interact with the CPU and serve as an coprocessor... > > That's a possiblity. You can find PCI FPGA prototyping cards for this > purpose. > > > BTW, if you need to process 1GB of data (assume that's the total amount of > traffic) you would need at least at least 7.75 seconds just to transfer the > data over a 33MHz PCI bus, not counting other PCI traffic, and other issues. 
> If that's too slow, you would need a) 66MHz b) 64bit c) PIX-X bus and of > course a PC that supports these. > > > I want to know which method is better? > > Depends on many things: > - Speed requirement (as fast as possible is not enough) > - Numerical Precision > - Price concerns > - Project deadlines > - More precise description of the problem (see above) > > > Considering that after solving this throughput problem, the next > bottleneck > > will be a 1GB memory that I need... I wonder if the graphic card has 1GB > > cache/memory inside it? > > None I've heard of. > > > Since a lot time it needs to do triple-buffling, I > > guess... it should have a high speed huge memory, right? > > Huge means 128-256MB nowadays. > > > I also don't know what is the maximum processing speed of a high-end > > graphical card comparing with a high end FPGA implementation? > > Impossible to answer in general. > > > Can anybody give me some comments/suggestions/advice/hints/pointers on > this? > > What I would suggest is to write a SW only solution for your problem. That > would give you, or anyone else a pretty good definition of the problem. > After this, you probably will be able to state much better questions. > > Regards, > Andras Tantos Hi, Andras, Thank you very much for your answer! I guess the first thing I need to make myself clear is that what is the essence of this problem? Is it a ray-tracing problme or collision detection problem? I need to identify the name of the problem first then I can go out and search for similar application cases... Can you help me on that? Thanks a lot, -Walala
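Following Andras's suggestion to start with a software-only version, the ray/cube segment length described in this thread can be sketched with the standard "slab" test for an axis-aligned box. This is a minimal sketch, not anyone's actual implementation: it assumes the cube is aligned with the coordinate axes (a question the thread leaves open), that each ray is given as an origin and a direction vector, and that only the part of the ray in front of its origin counts. All names are illustrative.

```python
import math


def chord_length(origin, direction, box_min, box_max):
    """Length of the part of a ray that lies inside an axis-aligned box.

    The ray is origin + t * direction for t >= 0. Returns 0.0 if the ray
    misses the box. Uses the slab method: intersect the ray with the pair
    of planes bounding the box on each axis and keep the common t interval.
    """
    t_near, t_far = 0.0, math.inf
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # Ray is parallel to this slab: it must already be inside it.
            if o < lo or o > hi:
                return 0.0
            continue
        t1, t2 = (lo - o) / d, (hi - o) / d
        if t1 > t2:
            t1, t2 = t2, t1
        t_near = max(t_near, t1)
        t_far = min(t_far, t2)
        if t_near > t_far:
            return 0.0  # the per-axis intervals do not overlap: a miss
    if math.isinf(t_far):
        return 0.0  # zero direction vector: not a valid ray
    # Convert the parameter interval into a geometric length.
    return (t_far - t_near) * math.sqrt(sum(d * d for d in direction))


if __name__ == "__main__":
    unit_min, unit_max = (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)
    print(chord_length((-1.0, 0.5, 0.5), (1.0, 0.0, 0.0), unit_min, unit_max))
    print(chord_length((-1.0, 2.0, 0.5), (1.0, 0.0, 0.0), unit_min, unit_max))
```

If the cube is rotated, one option is to transform each ray into the cube's own coordinate frame first and then apply the same test. Each ray is independent and needs only a handful of subtractions, divisions, and comparisons per axis, which is the kind of structure that could later be evaluated for 25 rays in parallel.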