Site Home Archive Home FAQ Home How to search the Archive How to Navigate the Archive
Compare FPGA features and resources
Threads starting:
Authors:A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Jonathon, I broke the lead of a CK722 (first Raytheon commercial xistor) that belonged to my older brother (who worked at Raytheon at the time building commercial radios and radars). Wow, did I get in trouble! Ever see a Western Electric WE24 Transistor? It had a pinch off tube on the top cap so they could adjust the point contacts for the best "gain" before closing it. Austin nospam wrote: > "Jonathan Bromley" <jonathan.bromley@doulos.co.uk> wrote: > > >"Martin Euredjian" <0_0_0_0_@pacbell.net> wrote in message > >news:9CTia.448$5k3.33562682@newssvr21.news.prodigy.com... > >> That's one heck of a way to date myself isn't it!!! At least nobody (yet) > >> suggested a Carbon-14 age test! :-) > > > >And how do you think I recognise the part numbers ? :-) > > > >The first semiconductor component I ever bought, as a > >nervous youngster, was an OC71. It cost me three weeks' > >pocket money. > > Just checked my old transistor draw and found an OC71, OC45s, and an AC128. > > Heh, remember the sorrow when having used it in so many breadboard circuits > a leg broke off :(Article: 54151
> > Funny you should say that ... I've done just that in the past for other (non > programming) applications where a symbolic description gets to the point > quickly. AutoCAD/AutoLISP is quite powerful in that regard. When I get > some time I'm going to learn ObjectARX which takes it one step farther. > > Another thought I had was that of ASCII (or text) based programming being an > outmoded modality. Back in college I was fortunate enough to have a Physics > professor who was hell-bent for a language called APL. He made a deal with > the CS department to allow students to take an APL class he concocted > instead of FORTRAN77 and get equivalent credit. I was one of those > students. > > Many describe APL as hieroglyphics because it uses various symbols to > represent functions and concepts. Ken Iverson created APL back in the early > 60's at IBM to describe the IBM360's hardware ... interestingly enough. So, > I've often wondered if the APL approach could be taken for a > "next-generation" HDL. I really think symbols are incredibly powerful ... > think of mathematics, if you had to use the word "integral" in place of the > stretch "S" shape, for example. I've seen some threads complaining about > such things as the "begin" and "end" tokens in Verilog and asking if they > could be replaced with the C-style "{ }" braces. > > Extending the use of something like Excel to generate code could be an > interesting interim apprach. Geometric (limited to rows and columns, of > course) relationships could be exploited to denote parallelism,etc. > > Just thinking out loud. > Wow, and I here I thought you'd accuse me of talking crazy talk! Apparently there were enough subconscious clues to pin down exactly the essence of your character and extrapolate from your use of two-dimensional thought processes and geometric CAD programming...or just dumb luck.Article: 54152
"arkaitz" <arkagaz@yahoo.com> wrote >. The thing is that when I synthezise > the project, the compiler changes the name of my internal signals or > nets. If an unchanged part of a design generates the same topology of nets and primitives, there is no excuse for synthesizers not giving them stable and repeatable names. Your synthesis vendor is just being lazy, or at any rate, insensitive to real world customer needs. (Of course if a source change <way over there> subtly causes the generated logic to differ <over here>, then you're still out of luck. But many small edits do not have a global effect.) http://www.fpgacpu.org/log/aug02.html#macrobuilder: "The current floorplanner allows manual control of placement, but when you change your source code, too often the synthesizer scrambles up all your synthesized net and instance names and the floorplanning process must be repeated. " "I wonder how Macro Builder helps address this issue, which always seemed to me to be an unnecessary synthesis induced problem. There is no reason synthesis tools cannot fabricate synthesized names, in a deterministic way, based upon the topology of the circuit nearby. Certain synthesized instances, such as slices of registers, can be given repeatable names derived from identifiers in the source code. (The 7th bit of register pc is of course pc<7>.) And a gate between register X and register PC[7] could be named $lut_X_PC<7> or $lut_hash("X_PC<7>"), instead of $lut1234. Under this regime, no matter what changes you make to other parts of your design, this particular synthesized LUT is going to be repeatably named $lut_X_PC<7>. " "(Well, if it were my problem to solve, first I'd lobby the synthesis tools vendors to synthesize to repeatable, canonicalized names. I strongly suggested doing exactly this to Xilinx two years ago. This helps with incremental design respins too. ...) " Jan Gray, Gray Research LLCArticle: 54153
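[Editorial note: the repeatable-naming scheme Jan sketches above can be modeled in a few lines of Python. The function names and the exact hashing scheme here are illustrative only, not any vendor's actual behavior.]

```python
import hashlib

def stable_lut_name(source_regs, dest_reg):
    # Build the name from the local topology (the registers the LUT
    # sits between), never from a global counter, so that edits
    # elsewhere in the design cannot change it.
    key = "_".join(sorted(source_regs) + [dest_reg])
    digest = hashlib.sha1(key.encode()).hexdigest()[:8]
    # Return both variants Jan mentions: $lut_X_PC<7> and $lut_hash(...)
    return "$lut_" + key, "$lut_" + digest

# A LUT between register X and register PC<7> is always $lut_X_PC<7>,
# no matter what changed in other parts of the design.
readable, hashed = stable_lut_name(["X"], "PC<7>")
```

Because the sources are sorted before joining, the name is also insensitive to the order in which the netlister happens to visit the fan-in, which is one of the ways global edits would otherwise leak into local names.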
Austin, thanks for the response. I have revised my clock switching based upon an old e-mail from you to the list and it now works properly. http://groups.google.ca/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&selm=3CC02A7B.C26976CC%40xilinx.com Austin Lesea <Austin.Lesea@xilinx.com> wrote in message news:<3E8B425A.EB8E32D3@xilinx.com>... > Douglas, > > Both clocks must be transistioning in order for the mux to operate. > > This is because the mux will always switch without creating a glitch, or > runt pulse (it is not a simple mux). > > Austin > > douglas fast wrote: > > > Hello, > > > > I am using Virtex-II, Xilinx ISE 5.2 SP1. > > > > I would like to use the bufgmux to select between one of two possible > > clocks. One clock comes in through an IBUFGDS_LVDS_33 block (LVDS > > clock). The second clock is generated by an on board DCM driven from > > an external crystal. The bufgmux is configured like this: > > > > BUFGMUX xxx ( > > .O(clk_out), // output clock > > .I0(ext_lvds_clk), // external lvds clock > > .I1(int_dcm_clk), // internally generated clock > > .S(select) // select line > > ); > > > > If ext_lvds_clk and int_dcm_clk are present and stable, then I can > > configure the bufgmux to select and output either input clock. > > However, if ext_lvds_clk is not present (pins are floating), then the > > bufgmux does not select the correct output. I get nothing out at all. > > More accurately, I think it stays stuck in a state where I0 is > > output. > > > > Is there some unwritten (or obscurely written) requirement for both > > clocks to be present in order for the bufgmux to function correctly? > > > > Thanks, > > > > DougArticle: 54154
Hi, I've seen in many posts people talking about using distributed arithmetic and SRL16 in Xilinx devices, mostly in Ray's posts. I tried to find something to read about it using Google, but had no luck. So basically I have only some guesses in my mind and no documents in hand. I have only a limited background in Xilinx devices, and all I know is that the LUTs can be put in an SRL16 mode that lets us feed in a 16-bit value; by changing the mode back to normal, we then have a LUT based on this value. So, what is this DA+SRL16 all about? Where and when should (or shouldn't) we use this technique? Where do the LUTs' contents come from? How should we store them (in BlockRAM?) How is it done in real practice, i.e. how can we code it in HDLs? Can we write a module in an HDL using this technique and yet preserve portability (to other devices without such a feature, like Altera's)? How about simulations? Best Regards Arash
Article: 54155
In article <3e8c65e8$1@epflnews.epfl.ch>, Arash Salarian <arash dot salarian at epfl dot ch> wrote: >Hi, > >I've seen in many posts, ppl are talking about using distributed arithmetics >and SLR16 in xilinx devices; mostly in Ray posts. I tried to find something >to read about it using google but I had no chance. So basically I have only >some guesses in my mind but no documents in hand. I've only a limited >background about Xlinx devices and all I know is that the LUTs can be put in >a SLR16 mode that would let us feed in a 16 bit value and then again by >changing the mode back to normal, we'll have a LUT based on this value. >So, what is this DA+SLR16 all about? Where and when should (or shouldn't) we >use this technique? Where the LUTs' content come from? How should we store >them (in BlockRAM?) How it's done in real practice, i.e. how we can code it >in HDLs? Can we write a module in a HDL using this technique and yet >preserve the portability (to other devices without such a feature, like >Altera's)? How about simulations? Distributed Arithmatic, its a technique for building VERY highly compact bit-serial/mutibit-serial arithmetic structures in FPGAs. http://www.andraka.com/distribu.htm The SRLs are very useful in this form for slightly coarser grained operations, eg a 4 bit by 4 constant bit multiply requires 8 LUTs to implement. If the constant never changes, you use a LUT. If the constant seldom changes but DOES, you shift in a new series of values, either generating them on the fly or from a premade larger table. Likewise, if the constants change all the time, a single SRL can be used to hold a 16 bit constant, with the address lines being used to determine which is selected, when creating a bit serial structure. -- Nicholas C. Weaver nweaver@cs.berkeley.eduArticle: 54156
I regret writing: "Your synthesis vendor is just being lazy, or at any rate, insensitive to real world customer needs." That was too strong. Generally speaking, tools developers are passionate, dedicated, work long hours, are inundated with complaints when things don't work, and yet their great work and productivity advances are usually taken for granted. I apologize. Jan Gray, Gray Research LLCArticle: 54157
Just to verify--DA does take more cycles, yes? I.e., it's a way of trading gates for speed? -Kevin > > Distributed Arithmatic, its a technique for building VERY highly > compact bit-serial/mutibit-serial arithmetic structures in FPGAs. > http://www.andraka.com/distribu.htm > > The SRLs are very useful in this form for slightly coarser grained > operations, eg a 4 bit by 4 constant bit multiply requires 8 LUTs to > implement. If the constant never changes, you use a LUT. If the > constant seldom changes but DOES, you shift in a new series of values, > either generating them on the fly or from a premade larger table. > > Likewise, if the constants change all the time, a single SRL can be > used to hold a 16 bit constant, with the address lines being used to > determine which is selected, when creating a bit serial structure. > -- > Nicholas C. Weaver nweaver@cs.berkeley.eduArticle: 54158
Offset constraints are often omitted without repercussion. If the clock speeds are low and/or the data in the FPGA is registered at the IOBs, not specifying the offset constraints usually results in no errors, even though it's probably not the best practice. But imagine this: your period is 10ns, and your inputs are not registered at the IOB. The PAR tool assumes that the input data appears at the pin right at the time the clock changes, and that it has 10ns (minus FF setup) to get that data through some logic and to the FF input. That is a very optimistic assumption, since the part on the board that generates the signal has a CLK->OUT time, and there is a delay on the trace. Imagine this total delay is 5ns; now you are missing timing by half a period! You must set an input offset constraint of 5ns so the PAR tool knows it only has 5ns to work with. -KevinArticle: 54159
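[Editorial note: in Xilinx UCF syntax of that era, the 5 ns example above becomes an input offset constraint along these lines; the net and clock names are placeholders.]

```
NET "data_in" OFFSET = IN 5 ns BEFORE "clk";
```

This tells PAR that data_in is valid at the pin only 5 ns before the clock edge, so only 5 ns of the 10 ns period is available for the pad-to-flip-flop path.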
Allan Herriman wrote: > Hi, > > Does anyone have experience with VHDL tool support for bit vectors > (or vectors of other types) that have lots of elements? > > I'm thinking of using one for a generic or a constant (not a signal) > to hold the initialisation value for a Xilinx block ram (18432 bits). > > I'm interested in both simulation and synthesis. Initialization for a block RAM occurs when the binary image is loaded into into the device. The only way to to control this from VHDL source is with device specific instances and attributes or by inferring a ROM by declaring a constant array of vectors of an appropriate size. If you intend to have a state machine load one port of a dual-port RAM after initialization, you could use a single vector, but an array of vectors would be easier. Sim and synth tools can handle vector widths of several hundred thousand bits, up to natural'high. It is not clear to me how this would help solve your problem. -- Mike TreselerArticle: 54160
In article <Ch_ia.3203$ug3.3574@rwcrnsc51.ops.asp.att.net>, Kevin Neilson <kevin_neilson@removethistextattbi.com> wrote: >Just to verify--DA does take more cycles, yes? I.e., it's a way of trading >gates for speed? Correct. However since you are so small & parallel, you get high throughput (higher, actually, because your pipelined stages can be smaller). -- Nicholas C. Weaver nweaver@cs.berkeley.eduArticle: 54161
hi folks ... I know that the Cyclone C6 and C20 are on the shelves at Arrow ... could someone from Altera give a heads-up on the status and availability of the C12Q240? thanks, KB
Article: 54162
Austin Lesea <Austin.Lesea@xilinx.com> writes: > I broke the lead of a CK722 (first Raytheon commercial xistor) that > belonged to my older brother (who worked at Raytheon at the time > building commercial radios and radars). If you haven't already read it, you might want to look up the article on the CK722 in the March 2003 issue of IEEE Spectrum. And, of course: http://www.ck722museum.com/Article: 54163
> hi folks ... I know that the Cyclone C6 and C20 are on the shelfs at > arrow ... could someone from Altera give a heads up on the status and > availability of the C12Q240 , thanks KB Altera's web site says: available now. But I too am waiting for first samples... My distributor told me today that I can expect the EP1C6 in May. Martin -------------------------------------------------------- JOP - a Java Processor core for FPGAs now on Cyclone: http://www.jopdesign.com/cyclone/
Article: 54164
>From Altera: Dear LeonardoSpectrum(TM)-Altera Software User, Beginning April 1, 2003, Altera will no longer include LeonardoSpectrum-Altera software licenses in software subscriptions or with free software products. Altera will, however, continue to provide Mentor Graphics(R) ModelSim(R) products for subscription customers and will continue to work with Mentor Graphics as a key EDA partner to support Altera(R) devices in commercial Mentor Graphics products. Existing LeonardoSpectrum-Altera software licenses will continue to work and Altera will continue to provide support for existing LeonardoSpectrum-Altera software licenses distributed by Altera. ----------------------- At first I hoped this was an April Fools' joke, but I'm not sure they would be so cruel. Oh dear, Altera's software solution just got worse. They have dropped the Leonardo synthesis tools. Leonardo was largely essential for getting decent synthesis over the full range of VHDL constructs. Oh well, let's hope Altera sells a PAR solution for a lower cost, because I really need to get away from the below-par, bug-ridden stuff I keep having to deal with. Cue Altera telling me it's all now wonderful, like I've heard a hundred times already. I'm getting worked up because I've spent years overcoming tool deficiencies and was comfortable with Leonardo + ActiveHDL + Altera PAR. I am, however, bitter that I need to pay Altera for a 'full' design flow just for PAR. Altera - I don't want your tools with half-baked promises of things working. How about a feature freeze and a bit more time aiming for robustness and reliability? By the way, I'm not suggesting that Leonardo didn't have bugs, or that other manufacturers' tools are any better - just that after years of fighting and arriving at a least-bad low-cost solution, I now have to either pay for Leonardo (fair enough, especially if Altera's PAR can be provided on its own for much lower $$$) or have more Altera synthesis grief. Paul Baxter, my opinions are my own
Article: 54165
acher@in.tum.de (Georg Acher) wrote in message news:<b6hgrt$meg$1@wsc10.lrz-muenchen.de>... > In article <20540d3a.0304030624.123cca37@posting.google.com>, > stenasc@yahoo.com (Bob) writes: > |> Hi all, > |> > |> I decided to have a go at writing a small fft for learning purposes. > |> (32 pts). I seem to be having problems with the fixed lengths for the > |> data. > |> > |> The twiddle factors are 16 bits. Initially, for butterfly stage 1, I > |> read in 16 bit input data (it has already been position reversed), do > |> the multiplication by the twiddle factor (I know you don't need to > |> multiply by the twiddle factor in the first stage as it is 1, but just > |> to keep it consistant).This brings the results for the real and > |> imaginary outputs to 32 bits. I truncate the 32 bits to the most > > A multiplication of two signed values with 16bit give a 31bit signed value. You > have only one sign bit in the output, but two in the input data ;-) > > Can that be the problem? Hi Georg, Thanks for replying. You are correct when you state that I need only 31 bits for the multiplication. However, the 32nd bit in my case is also the correct sign. OK, I may have one more bit than I need, but would that really make much of a difference? In my case the outputs after several stages go to all binary zeros, i.e. 00000000000000....0000b, or all ones, 1111111111111....111b. I end up getting no values in the final stage at all. My input data consists of two sine waves added together. When I convert this binary data to floating point and do the FFT of it using an FFT executable for the PC, I can clearly see the two peaks. With the FFT code that I am attempting to create, I get zeros or all ones at the output. I think it is the quantization of the data between butterfly stages, but I can't narrow it down at the moment. Bob
Article: 54166
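[Editorial note: Georg's point about the doubled sign bit is a plausible explanation for the all-zeros/all-ones symptom. A 16x16 signed multiply yields only 31 significant bits, so taking the top 16 of 32 bits keeps the redundant sign bit and halves every value; repeated over several butterfly stages, small samples decay to 0 or -1. A quick Q15 sketch of the difference, with illustrative values:]

```python
def q15_mul_truncated_wrong(a, b):
    # Top 16 of 32 bits: keeps the redundant sign bit, so every
    # product comes out at half scale.
    return (a * b) >> 16

def q15_mul_truncated_right(a, b):
    # Bits [30:15] of the 31-bit product: correct Q15 scaling.
    return (a * b) >> 15

half = 1 << 14        # 0.5 in Q15 (1 << 15 would be 1.0)
sample = 9            # a small time-domain sample
right = q15_mul_truncated_right(sample, half)   # ~0.5 * 9
wrong = q15_mul_truncated_wrong(sample, half)   # an extra halving
```

Repeated at every stage, that extra halving drives small positive values to all zeros and small negative values to all ones.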
For a discussion on Distributed Arithmetic (DA), see the DA tutorial page on my website. It is basically a method to take advantage of the LUT structure of an FPGA to get more multiplies per unit area. It is traditionally bit serial, but can also be parallelized to compute as many bits in parallel as you are willing to pay for in area, all the way up to a full parallel implementation. It is basically a rearrangement of the partial products in a sum of products to reduce the hardwre. SRL16 mode is a Xilinx exclusive feature that allows the LUT to be used as an addressable shift register. It is essentially a 16 bit serial-in parallel out shift register with a 16:1 selector to select one of the 16 shift register taps as the output. When the data is not being shifted in, it behaves exactly like a LUT so that the data currently inside is addressed with the 4 LUT inputs. The SRL16s are useful for delay queues, in which case the address is generally fixed and data is shifted in on every clock. For DA applications, normally one uses LUTs containing all the possible sums of 4 constants times the 4 1 bit inputs (see my website). A drawback to DA in the traditional sense is that the coefficients are fixed since they are set at compile time. By replacing them with SRL16's you can provide an ability to reload the coefficients without having to reconfigure the FPGA. It requires some sort of loader circuit to obtain the new data from somewhere (a memory or from some host processor), as well as something to compute the LUT contents, which have 16 entries for each set of four coefficients (this can be precomputed or computed inside the FPGA). Support for SRL16's in other than a simple delay queue is somewhat sparse in the synthesis tools, so we just instantiate the SRL16's. Instantiation has an added advantage of letting you do the floorplanning in the code. The SRL16 is unique to Xilinx, so even if the code were written without primitives, it wouldn't be very portable. 
SRL16s can also be used for reorder queues by modulating the address lines while data is shifting in. Arash Salarian wrote: > Hi, > > I've seen in many posts, ppl are talking about using distributed arithmetics > and SLR16 in xilinx devices; mostly in Ray posts. I tried to find something > to read about it using google but I had no chance. So basically I have only > some guesses in my mind but no documents in hand. I've only a limited > background about Xlinx devices and all I know is that the LUTs can be put in > a SLR16 mode that would let us feed in a 16 bit value and then again by > changing the mode back to normal, we'll have a LUT based on this value. > So, what is this DA+SLR16 all about? Where and when should (or shouldn't) we > use this technique? Where the LUTs' content come from? How should we store > them (in BlockRAM?) How it's done in real practice, i.e. how we can code it > in HDLs? Can we write a module in a HDL using this technique and yet > preserve the portability (to other devices without such a feature, like > Altera's)? How about simulations? > > Best Regards > Arash -- --Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email ray@andraka.com http://www.andraka.com "They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759Article: 54167
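[Editorial note: a behavioral Python model of the bit-serial DA structure Ray describes above. One 4-input LUT holds all 16 possible sums of four fixed coefficients, and the dot product is built up one input bit-plane per clock. The coefficients and widths are made up for illustration, and the model is unsigned to keep two's-complement sign handling out of the way.]

```python
COEFFS = [3, 5, 7, 11]   # four fixed coefficients (illustrative values)

# 16-entry table: entry 'a' is the sum of the coefficients whose address
# bit is set -- exactly what one 4-input LUT stores in a DA multiplier.
LUT = [sum(c for i, c in enumerate(COEFFS) if (a >> i) & 1)
       for a in range(16)]

def da_dot(xs, nbits=8):
    """Bit-serial dot product of four unsigned nbits-wide inputs."""
    acc = 0
    for b in range(nbits):                           # one clock per bit
        # Bit b of each input forms the 4-bit LUT address.
        addr = sum(((x >> b) & 1) << i for i, x in enumerate(xs))
        acc += LUT[addr] << b                        # shift-and-accumulate
    return acc
```

Reloading the LUT contents at run time is where SRL16 mode comes in: shifting in a new 16-entry table swaps the coefficient set without reconfiguring the FPGA, exactly the trade-off discussed above.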
Not necessarily, you can go to a full parallel implementation, in which case it does not take more cycles. In the full parallel implementation, the area savings is reduced to just about zero over a traditional MAC architecture. Kevin Neilson wrote: > Just to verify--DA does take more cycles, yes? I.e., it's a way of trading > gates for speed? > -Kevin > > > > Distributed Arithmatic, its a technique for building VERY highly > > compact bit-serial/mutibit-serial arithmetic structures in FPGAs. > > http://www.andraka.com/distribu.htm > > > > The SRLs are very useful in this form for slightly coarser grained > > operations, eg a 4 bit by 4 constant bit multiply requires 8 LUTs to > > implement. If the constant never changes, you use a LUT. If the > > constant seldom changes but DOES, you shift in a new series of values, > > either generating them on the fly or from a premade larger table. > > > > Likewise, if the constants change all the time, a single SRL can be > > used to hold a 16 bit constant, with the address lines being used to > > determine which is selected, when creating a bit serial structure. > > -- > > Nicholas C. Weaver nweaver@cs.berkeley.edu -- --Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email ray@andraka.com http://www.andraka.com "They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759Article: 54168
Jan Gray wrote: > > If an unchanged part of a design generates the same topology of nets and > primitives, there is no excuse for synthesizers not giving them stable and > repeatable names. Your synthesis vendor is just being lazy, or at any rate, > insensitive to real world customer needs. I'm well aware of the value of preserving names as far into the synthesized EDIF as can be managed. I spend a lot of time struggling with computer generated names, trying to debug code generation. Not to mention trying to optimize results. It's not always so easy as you imply. If the synthesizer makes any non-trivial attempts to optimize, then all kinds of global scrambling may take place. Often I find it hard to *choose* a name for a nexus, much less pin it down in the face of dead code elimination. Not saying I can't do better. At least for my synthesizer, many signal names do survive through synthesis, and I am looking into placing them in the right places in generated EDIF. Things get odd when multiple signals are linked together, though. Suggestions are welcome. > Well, if it were my problem to solve, first I'd lobby the synthesis > tools vendors to synthesize to repeatable, canonicalized names. I'm suitably lobbied, thank you. You're not the first, you see. -- Steve Williams "The woods are lovely, dark and deep. steve at icarus.com But I have promises to keep, steve at picturel.com and lines to code before I sleep, http://www.picturel.com And lines to code before I sleep."Article: 54169
Thanks to everyone for the responses. We've decided for now to go with the Cyclone part, EP1C6, since it has more pins in the PQFP package. It seems to me that non-BGA packages are being heavily deemphasized. I think that this is bad for many reasons:
- protos are harder to make and cost more
- more board layers are needed
- rework is difficult if not impossible
- probing is much harder
I hope that large PQFPs will be around for many years to come. Matt
Article: 54170
I absolutely agree with Rudi. As someone who has done high-speed board design, fpga based logic design and full custom ic design, I must say that most of what you wrote in the above article strikes me as bordering on nonsense. I have enjoyed your tutorials on signal integrity and find most of your postings in this group very usefull, so I am amazed at how willing you are to abandon reason and truth in favour of Xilinx marketing. I hope they pay you REALLY well ... Ljubisa Bajic, VLSI Design Engineer, Oak Technology, Teralogic Group --------------My opinions do not reflect those of my employer.------------- Austin Lesea <Austin.Lesea@xilinx.com> wrote in message news:<3E887532.31FE90B4@xilinx.com>... > Nicholas, > > The original question was "why would anyone spend $4,000." > > Good question. No one does. Well almost no one. I suppose the 'monster' > FPGAs (like the 2V8000, or the 2VP100) will always command a premium until > they, too, are mainstream - just a question of demand). > > 1M+ gates up until now has certainly been much less than $4,000 (even in small > quantities). > > Now we are talking about even less money for 1M+ gates in 90 nm. > > ASICs are all but dead except for those really big jobs that can afford the > $80M++ price tag to develop them. Or those jobs where low current is required > (ie cell-phones). > > Even televisions don't sell enough to afford some of the new ASIC pricetags. > Think about it. An "appliance" doesn't sell in large enough volume to have > its own ASIC. > > The recent EETimes article on IP at these geometries was especially telling. > Integration of IP at 130 nm and 90nm is a hightmare......etc. etc. etc. The > 80M$ figure above was from that article. > > So 'cheap' ASICs are stuck at 180nm (and above). But with 90nm FPGAs we are > three or more techology steps ahead (.15, .13, .09), and that makes us a > better deal. > > Austin > > "Nicholas C. 
Weaver" wrote: > > > In article <3E886139.96955371@xilinx.com>, > > Austin Lesea <Austin.Lesea@xilinx.com> wrote: > > >Really? > > > > > >Have just annouced 90nm shipped samples. > > > > > >http://biz.yahoo.com/prnews/030331/sfm087_1.html > > > > > >so I would suspect that you might want to get in touch with another > > >distributor.... > > > > > >Might find 1+ million gates for a whole lot less.... > > > > Thats "250,000 quantities at (the end of?) 2004". :) > > > > -- > > Nicholas C. Weaver nweaver@cs.berkeley.eduArticle: 54171
In article <e8fd79ea.0304031738.3c3bb265@posting.google.com>, Matt Ettus <matt@ettus.com> wrote: >Thanks to everyone for the responses. We've decided for now to go >with the Cyclone part, EP1C6, since it has more pins in the PQFP >package. > >It seems to me that the non-BGA packages seem to be heavily >deemphasized. I think that this is bad for many reasons: > >- protos harder to make and cost more >- must use more board layers >- rework difficult if not impossible >- probing is much harder > >I hope that large PQFPs will be around for many years to come. Don't count on it lasting much longer. Pin bandwidth is a precious commodity, and a PQFP-style package's pin count grows with the square root of the board footprint, while a BGA's grows linearly with it. Do you know if anyone's succeeded in toaster-oven soldering of BGA packages? -- Nicholas C. Weaver nweaver@cs.berkeley.edu
Article: 54172
On Thu, 03 Apr 2003 09:57:06 -0800, Mike Treseler <tres@fluke.com> wrote: >Allan Herriman wrote: >> Hi, >> >> Does anyone have experience with VHDL tool support for bit vectors >> (or vectors of other types) that have lots of elements? >> >> I'm thinking of using one for a generic or a constant (not a signal) >> to hold the initialisation value for a Xilinx block ram (18432 bits). >> >> I'm interested in both simulation and synthesis. > > >Initialization for a block RAM occurs when the binary >image is loaded into into the device. The only way to >to control this from VHDL source is with device >specific instances and attributes or by inferring >a ROM by declaring a constant array of vectors >of an appropriate size. Not quite the "only way". In simulation, one needs to use the INIT_XX generics on the block rams. The attributes are ignored. >Sim and synth tools can handle vector widths >of several hundred thousand bits, up to natural'high. Somehow I can't see any tool working with a vector length of 2 ** 31 - 1. (At least not under versions of Windows that have problems allocating more than 2Gbyte of ram to a process.) I have seen std_logic_vectors of several hundred thousand bits used in Modelsim quite successfully. Does anyone have any other practical experience? I was rather hoping someone from Synplicity would reply. I know they read this news group. Thanks, Allan.Article: 54173
This sounds like it might be a job for a DSP--you can do a MAC per cycle and you can certainly do more than 40MHz. You'd put one matrix in X memory and one in Y memory. You'd need 216 cycles for the MACs and a little overhead to store the result every 6 MACs. The FPGA would be more appropriate for more hardcore processing, like a matrix multiply in less than 216 cycles. I don't know if the Altera part has multipliers, but if you were to use a Virtex-II part, you could use four of the 18x18 multipliers and a couple of adders to do a 36x36 multiply, and in this manner you could do one (possibly pipelined) MAC per cycle. I'd put the matrices in different RAM blocks so you can access them simultaneously, or at least use both ports of a dual-port block to access them simultaneously. -Kevin "jerry1111" <jerry1111@wp.pl> wrote in message news:b6f2ol$826$1@atlantis.news.tpi.pl... > > You didn't say how fast it has to run. > > Sorry - I was so absorbed with this problem that I forgot to write it: > > 10us with 40MHz clock => 400 cycles would be perfect, but it's almost > impossible - at least from my current point of view. > I'll be happy getting any reasonable speed of course, but this timing gives > an idea how should it be.... > > I have to do 6 muls and cumulate them (6 adds) for each element from result matrix. > Matrix is 6x6, so 36 elements gives 216 muls and 216 adds.... > > Now I'm thinking about some sort of parralel operations, but it's not so > simple because of storing data in ram. The best would be to store each row > from A in separate block, columns from B in another 6 blocks, multiplying with > 6 parallel logic pieces and feed results to FIFO. Each row/column is 6x36 bits - > - maybe it would be better to make some pipelinig... > > Now I have 10 sheets of paper with various solutions, but I'd like to hear > opinions from 'another point of view'.... > > Selected device is EP1C6 from Altera. > > PS: Sorry for my bad english, but I'm more reading than writing. 
> > -- > jerry > > "The day Microsoft makes something that doesn't suck is probably > the day they start making vacuum cleaners." - Ernst Jan Plugge > >Article: 54174
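[Editorial note: Kevin's cycle budget checks out in a quick model of the one-MAC-per-cycle schedule. The 6x6 size is from the thread; the test data is arbitrary.]

```python
N = 6  # 6x6 matrices, as in the thread

def matmul_one_mac_per_cycle(a, b):
    """Schoolbook matrix multiply, counting one MAC per 'cycle'."""
    cycles = 0
    c = [[0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            acc = 0
            for k in range(N):
                acc += a[i][k] * b[k][j]   # one multiply-accumulate
                cycles += 1
            c[i][j] = acc
    return c, cycles

a = [[i + j for j in range(N)] for i in range(N)]
b = [[(i * j) % 5 + 1 for j in range(N)] for i in range(N)]
_, cycles = matmul_one_mac_per_cycle(a, b)
# cycles == 216: 36 output elements x 6 MACs each.  Six parallel MAC
# units (one per row, as jerry sketched) would bring this to 36 cycles,
# comfortably inside the 400-cycle / 10 us budget.
```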
Kevin, FYI, Altera's Stratix and Stratix GX devices come with a number of DSP blocks. Each can be configured as 4 9x9 multipliers, 2 18x18, or 1 36x36. Each also incorporates two adder/subtractor/accumulator units, and can do various permutations of multiply and add operations, with optional accumulate and pipelining. They are most definitely appropriate for a function such as this. You can read up more at http://www.altera.com/products/devices/stratix. Regards, Paul Leventis Altera Corp. "Kevin Neilson" <kevin_neilson@removethistextattbi.com> wrote in message news:Jk8ja.358031$S_4.435318@rwcrnsc53... > This sounds like it might be a job for a DSP--you can do a MAC per cycle and > you can certainly do more than 40MHz. You'd put one matrix in X memory and > one in Y memory. You'd need 216 cycles for the MACs and a little overhead > to store the result every 6 MACs. > > The FPGA would be more appropriate for more hardcore processing, like a > matrix multiply in less than 216 cycles. > > I don't know if the Altera part has multipliers, but if you were to use a > Virtex-II part, you could use four of the 18x18 multipliers and a couple of > adders to do a 36x36 multiply, and in this manner you could do one (possibly > pipelined) MAC per cycle. I'd put the matrices in different RAM blocks so > you can access them simultaneously, or at least use both ports of a > dual-port block to access them simultaneously. > > -Kevin > > "jerry1111" <jerry1111@wp.pl> wrote in message > news:b6f2ol$826$1@atlantis.news.tpi.pl... > > > You didn't say how fast it has to run. > > > > Sorry - I was so absorbed with this problem that I forgot to write it: > > > > 10us with 40MHz clock => 400 cycles would be perfect, but it's almost > > impossible - at least from my current point of view. > > I'll be happy getting any reasonable speed of course, but this timing > gives > > an idea how should it be.... > > > > I have to do 6 muls and cumulate them (6 adds) for each element from > result matrix. 
> > Matrix is 6x6, so 36 elements gives 216 muls and 216 adds.... > > > > Now I'm thinking about some sort of parralel operations, but it's not so > > simple because of storing data in ram. The best would be to store each row > > from A in separate block, columns from B in another 6 blocks, multiplying > with > > 6 parallel logic pieces and feed results to FIFO. Each row/column is 6x36 > bits - > > - maybe it would be better to make some pipelinig... > > > > Now I have 10 sheets of paper with various solutions, but I'd like to hear > > opinions from 'another point of view'.... > > > > Selected device is EP1C6 from Altera. > > > > PS: Sorry for my bad english, but I'm more reading than writing. > > > > -- > > jerry > > > > "The day Microsoft makes something that doesn't suck is probably > > the day they start making vacuum cleaners." - Ernst Jan Plugge > > > > > >