Eric Smith wrote: > Rudolf Usselmann <russelmann@hotmail.com> writes: >> Unfortunately I am using a "unsupported OS" (FC3) ... >> So I guess I am out of luck using 7.1 64 bit ... >> >> I have tried making sure all xilinx environment variables >> are not set, and that the installation directory is empty - >> I am still getting seg. fault ... > > Strange, I'm using FC3 and it works for me, with one inconsequential > problem. It needed older versions of some libraries before it would > load, but after I installed those it was fine. I posted about that on > March 14: > > http://groups-beta.google.com/group/comp.arch.fpga/msg/4b592cb14bad823f Eric, I read your post and did try the things you suggest, but like you, I too was NOT successful, installing the 64 bit version of ISE 7.1. Best Regards, rudi ============================================================= Rudolf Usselmann, ASICS World Services, http://www.asics.ws Your Partner for IP Cores, Design, Verification and SynthesisArticle: 82551
You can find everything that you want to know at the Philips site: http://www.semiconductors.philips.com/markets/mms/protocols/i2c/ I recommend that you download and read the specification first. You can probably safely ignore anything related to multimaster or high-speed mode, but be aware that these exist. They also have discussion forums where you can ask questions. -KeithArticle: 82552
Anthony Mahar wrote: > Nju Njoroge wrote: > > Interesting question for the "Monitoring Capsule Design" paper... they > state they monitor behavior "between the CPU and L1 Dcache." Did they > explain how they were able to do this, since the PPC405 and L1 are part > of the same hard core? > You are right--the CPU and the L1 cache are in the same hard core, so we don't have access to the interface between the CPU core and the cache. As I described in my previous post, they placed their monitor at the interface of the L1 cache ports that are usually connected to the PLB. Thus, instead of connecting their CPU to the PLB bus, they connected the PPC core to their monitor, which is then connected to the PLB. NNArticle: 82553
On Wed, 13 Apr 2005 23:14:58 -0400, Mark Jones wrote: > James Beck wrote: >> In article <upWdnTeAPOxSqcDfRVn-sA@buckeye-express.com>, abuse@127.0.0.1 >> says... >> >>>praveen.kantharajapura@gmail.com wrote: >>> >>>>Hi all, >>>> >>>>This is a basic qustion regarding SDA and SCL pins. >>>>Since both these pins are bidirectional these should pins need to be >>>>tristated , so that the slave can acknowledge on SDA. >>> >>> >>> No, both pins are not bidirectional. Only the master device drives the SCK >>>line, and all slaves must leave their SCK's as input. >> >> >> Not true, a slave device can extend a cycle through clock stretching and >> the only way to do that is for the slave device to be able to hold the >> clock line low. >> >> http://www.i2c-bus.org/clockstretching/ >> > > Explain that to a Noob. > > Please. A slave device can use the clock as a primitive flow control mechanism. If the slave takes some amount of time to process a byte, it can prevent the master from starting the next byte by simply holding the clock low. The master cannot clock the data until the clock is released by the slave. I2C clocks are not really clocks, they are 'data valid' signals. They don't have to go at any particular rate, and aren't constrained by anything except setup and hold times for the devices. Regards, Bob MonsenArticle: 82554
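[A master that honours clock stretching has to sample SCL after releasing it, and wait until the wire has actually gone high before treating the bit as clocked. A minimal Verilog sketch of that wait; signal names are made up for illustration and are not from any of the posts above:]

```verilog
// Open-drain SCL: the master drives 0, or releases to Z and lets the
// external pull-up raise the line. A stretching slave can keep the
// wire low even after the master has released it.
module scl_stretch_wait (
    input  wire clk,            // system clock, much faster than SCL
    input  wire release_scl,    // master wants SCL high for this bit
    inout  wire scl_pin,        // bidirectional SCL pad
    output reg  scl_is_high     // safe to time the high phase / advance
);
    // The master never drives SCL to a hard 1.
    assign scl_pin = release_scl ? 1'bz : 1'b0;

    // Wait until the wire itself reads back high - if a slow slave is
    // stretching, scl_pin stays 0 and scl_is_high stays deasserted.
    always @(posedge clk)
        scl_is_high <= release_scl && (scl_pin == 1'b1);
endmodule
```

[Only when scl_is_high asserts does the master start timing the high period of the clock; a slow slave simply delays that assertion for as long as it needs.]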
Anthony Mahar wrote: > Nju Njoroge wrote: > > Anthony Mahar wrote: > > > >>Hello, > >> > >>Is there a way to do performance monitoring on the PPC405 in the > > > > Virtex > > > >>II Pro? I am specifically interested in cache hits. > >> > >>I have wedged my own device between the CPU's instruction and data > > > > PLB > > > >>interfaces and can currently get cache misses. But I need to find a > > > > way > > > >>to determine cache hits of an application running under an operating > >>system. > >> > >>If it was stand alone I could figure that information out by the > > > > number > > > >>of load and store instructions, but this is an operating system with > >>context switches, interrupt handlers, etc. > >> > >>Is there a way to gather this information? There did not seem to be > > > > any > > > >>performance monitoring registers as seen with newer PowerPC and x86 > >>systems. Can the trace port be used to passively monitor execution > > > > for > > > >>load/store instructions? > > > > > > Unfortunately, I have few answers to your questions. However, I know of > > a research group in Georgia Tech that is designing/designed a memory > > access monitor, which sounds similar to yours. You may want to > > correspond with them to exchange notes. I learned of their monitor at > > the HPCA 2005 FPGA workshop. Here is a link to the workshop > > http://cag.csail.mit.edu/warfp2005/. A link to the workshop > > presentations is here at > > http://cag.csail.mit.edu/warfp2005/program.html. Their presentation was > > titled "Evaluating System wide Monitoring Capsule Design using Xilinx > > Virtex II Pro FPGA". Their paper has their contact information. > > > > As for the trace port, I have used it with an IBM/Agilent RISCWatch (RW) > > box, which collects a dynamic trace of the instructions over 8 million > > CPU cycles. The main limitation is that it only works for stand alone > > apps. 
When you have virtual memory enabled (while running Linux for > > instance), RW uses the TLB to conduct the virtual to physical address > > translations. This is great for regular code. However, when an > > interrupt is detected, the CPU converts to using physical addresses for > > the interrupt handler. Unfortunately, RW continues to use the TLB so it > > tries to translate physical addresses, for which no "translations" > > exist, so RW is unable to resolve interrupt handler instructions. > > After this point, the trace is corrupted. In any case, if you are > > interested in learning more about RW, you can refer to this appnote > > http://direct.xilinx.com/bvdocs/appnotes/xapp545.pdf. It has links to > > all manuals for the RW box and its tools. > > > > Lastly, for my own curiosity, how difficult was it to design and debug > > your monitor? The guy I spoke to from Georgia Tech at the workshop said > > they used Chipscope to learn the protocol (along with IBM's PLB spec). > > He claims that this was a painstaking process. > > > > NN > > > > Thank you Nju, > > I am going to dig into those docs right now. > > My design was not intended to be a monitor, but an active bus > transaction modifier. On certain transactions, I have to perform > certain operations on the data going to the PPC405. This means I > selectively pass data through, or perform some higher latency operations. > > Since I am currently interested in cache-miss performance, I only count > the number of transaction requests from L1 cache. Because it is an > individual word that caused the instruction miss, all other words > retrieved in the transaction are, of course, not considered as a miss. > This makes it extremely easy to monitor the number of transaction > requests. > > While the module is an active component between the CPU and PLB, it is > very easy to add a passive monitor once you have a way to have the EDK > inject the monitor in the middle. 
For myself, It required some time to > understand the EDK .mpd format and effectively create a PLB-PLB bridge > (no logic, pure pass through), and there may be better ways with the > "transparent" bus format that I haven't had time to look into. But at > the time it was also my first EDK peripheral. > If I understand correctly, you are saying that your transaction modifier acts as a PLB Bus to PLB Bus bridge. So, in the EDK project, you connected the CPU to a PLB bus, then connected your module to that PLB bus and then connected another PLB bus on the other side of your pcore? CPU <->PLB Bus -> your pcore <-> PLB BUS <-> Memory (Cache/BRAM) If my understanding is correct, you in essence designed a PLB-PLB bridge, like the PLB-OPB bridge, right? In our research, we also designed a PLB to PLB bridge. Our pcore was initially a pass-through in between the two buses, then we placed our real module when we got the pass-through running. The guys from Georgia Tech, however, interfaced their monitor module directly with PPC's PLB ports, so they couldn't use EDK's abstraction of the bus protocol through the PLB IPIF module. In fact, they had to synthesize their project in ISE since EDK wouldn't support what they were trying to do. That's why they had to use ChipScope to really see what the processor does. > As for 'learning' the PLB system, I found the IBM CoreConnect Bus > Functional Model (BFM) for the PLB, with the PLB doc, to be instrumental > in observing every kind of transaction I had to handle. I think the BFM > would be far easier than using ChipScope/Docs alone. The BFM allows the > generation of almost any kind of cycle-accurate PLB transaction a master > and slave can use. > > One other model I would like to begin using is the Xilinx provided > PPC405 swift model, which will allow the same code used by the real > processor to run on the simulation swift model simulation. 
This will > cause PLB transactions to occur in the same way they will on the real > system, i.e. cache line fills based on the PPC405 MMU's state, etc. > In designing our pass-through, we used the swift models. I definitely recommend learning how to use them. The swift models allow you to conduct full-system simulations. As for the BFM's, we weren't able to use them for our pcore since EDK 6.3i IPIF Create/Import wizard didn't support the use of Verilog modules (7.1 now supports this). We could have hacked this by using a netlist, but you cannot pass parameters/generics into a netlist, which is a feature that is required for our pcore. I have used the BFM's for a VHDL module I worked on in the past and I agree that they too were helpful. NNArticle: 82555
On Wed, 13 Apr 2005 05:30:39 -0700, praveen.kantharajapura wrote: > Hi all, > > This is a basic qustion regarding SDA and SCL pins. > Since both these pins are bidirectional these should pins need to be > tristated , so that the slave can acknowledge on SDA. > > But i have seen in some docs that a '1' need to be converted to a 'Z' > while driving on SDA and SCL, what is the reason behind this???? > > Thanks in advance, > Praveen You need a primer in I2C. If you are the only master, and drive the SDA wire to 1, nothing bad will happen unless, the slave thinks it's supposed to ack at the wrong time, at which time you'll get a short between the master and slave. The standard specifies resistors you can add to keep the devices from getting damaged in this case. However, you CANNOT drive SCL to 1, because a slave is allowed to hold you off by driving it low. You have to notice this, so driving it high is not going to work unless you make the clock slow enough so that the slave will never hold you off. In most I2C applications, the bus and clock should only go high when nobody is pulling it low. There are 'fast' modes of I2C which may not obey this restriction. However, a device starts off in the pullup mode, and then switches over, I believe. ---- Regards, Bob MonsenArticle: 82556
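[The '1'-becomes-'Z' rule from the original question is just the open-drain convention Bob describes: a device only ever pulls a wire low or lets go of it, and the external pull-up supplies the high level. In an FPGA that is the classic tristate idiom; the sketch below is purely illustrative, with made-up names:]

```verilog
// Open-drain I2C pads: a logical 1 is driven as Z and the external
// pull-up resistor makes the wire high. Driving a hard 1 would fight
// any other device that is legitimately pulling the wire low.
module i2c_open_drain (
    input  wire sda_out,   // bit the core wants on SDA (1 = release)
    input  wire scl_out,   // bit the core wants on SCL (1 = release)
    inout  wire sda_pin,   // bidirectional SDA pad
    inout  wire scl_pin,   // bidirectional SCL pad
    output wire sda_in,    // wire state read back (ACKs, arbitration)
    output wire scl_in     // wire state read back (clock stretching)
);
    assign sda_pin = sda_out ? 1'bz : 1'b0;
    assign scl_pin = scl_out ? 1'bz : 1'b0;
    assign sda_in  = sda_pin;
    assign scl_in  = scl_pin;
endmodule
```

[Because every device follows this rule, the bus is a wired-AND: a wire is high only when nobody is pulling it low, which is exactly what makes acknowledges, multi-master arbitration and clock stretching possible.]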
Ralf Duschef wrote: > There are plenty of issues in 7.1 SP1. yes, there are :-( > Handle with care! good advice - use it, but carefully Austin Lesea wrote: > > > Have you logged this into the hotline as a case? Best way to address > > > new software glitches is to report them. Hopefully, it's not only me reporting 'glitches' (!?!) to Xilinx... Maybe SP2 has 'stabilized'. b.t.w. - heard about 'glitches' only in HW design before !!! JochenArticle: 82557
Anthony Mahar wrote: > Nju Njoroge wrote: > > Anthony Mahar wrote: > > > >>Hello, > >> > >>Is there a way to do performance monitoring on the PPC405 in the > > > > Virtex > > > >>II Pro? I am specifically interested in cache hits. > >> > >>I have wedged my own device between the CPU's instruction and data > > > > PLB > > > >>interfaces and can currently get cache misses. But I need to find a > > > > way > > > >>to determine cache hits of an application running under an operating > >>system. > >> > >>If it was stand alone I could figure that information out by the > > > > number > > > >>of load and store instructions, but this is an operating system with > >>context switches, interrupt handlers, etc. > >> > >>Is there a way to gather this information? There did not seem to be > > > > any > > > >>performance monitoring registers as seen with newer PowerPC and x86 > >>systems. Can the trace port be used to passively monitor execution > > > > for > > > >>load/store instructions? > > > > > > Unfortunately, I have few answers to your questions. However, I know of > > a research group in Georgia Tech that is designing/designed a memory > > access monitor, which sounds similar to yours. You may want to > > correspond with them to exchange notes. I learned of their monitor at > > the HPCA 2005 FPGA workshop. Here is a link to the workshop > > http://cag.csail.mit.edu/warfp2005/. A link to the workshop > > presentations is here at > > http://cag.csail.mit.edu/warfp2005/program.html. Their presentation was > > titled "Evaluating System wide Monitoring Capsule Design using Xilinx > > Virtex II Pro FPGA". Their paper has their contact information. > > > > As for the trace port, I have used it with an IBM/Agilent RISCWatch (RW) > > box, which collects a dynamic trace of the instructions over 8 million > > CPU cycles. The main limitation is that it only works for stand alone > > apps. 
When you have virtual memory enabled (while running Linux for > > instance), RW uses the TLB to conduct the virtual to physical address > > translations. This is great for regular code. However, when an > > interrupt is detected, the CPU converts to using physical addresses for > > the interrupt handler. Unfortunately, RW continues to use the TLB so it > > tries to translate physical addresses, for which no "translations" > > exist, so RW is unable to resolve interrupt handler instructions. > > After this point, the trace is corrupted. In any case, if you are > > interested in learning more about RW, you can refer to this appnote > > http://direct.xilinx.com/bvdocs/appnotes/xapp545.pdf. It has links to > > all manuals for the RW box and its tools. > > > > Lastly, for my own curiosity, how difficult was it to design and debug > > your monitor? The guy I spoke to from Georgia Tech at the workshop said > > they used Chipscope to learn the protocol (along with IBM's PLB spec). > > He claims that this was a painstaking process. > > > > NN > > > > Thank you Nju, > > I am going to dig into those docs right now. > > My design was not intended to be a monitor, but an active bus > transaction modifier. On certain transactions, I have to perform > certain operations on the data going to the PPC405. This means I > selectively pass data through, or perform some higher latency operations. > > Since I am currently interested in cache-miss performance, I only count > the number of transaction requests from L1 cache. Because it is an > individual word that caused the instruction miss, all other words > retrieved in the transaction are, of course, not considered as a miss. > This makes it extremely easy to monitor the number of transaction > requests. > > While the module is an active component between the CPU and PLB, it is > very easy to add a passive monitor once you have a way to have the EDK > inject the monitor in the middle. 
For myself, It required some time to > understand the EDK .mpd format and effectively create a PLB-PLB bridge > (no logic, pure pass through), and there may be better ways with the > "transparent" bus format that I haven't had time to look into. But at > the time it was also my first EDK peripheral. > If I understand correctly, you are saying that your transaction modifier acts as a PLB Bus to PLB Bus bridge. So, in your XPS project, you connected the CPU to a PLB bus, then connected your module to that PLB bus and then connected another PLB bus on the other side of your pcore? I assume you also used Create/Import IPIF Wizard, right. CPU <->PLB Bus -> your pcore <-> PLB BUS <-> Memory (Cache/BRAM) If my understanding is correct, you in essence designed a PLB-PLB bridge, as in the diagram above. In our research, we also designed a PLB to PLB bridge. Our pcore was initially a pass-through in between the two buses, then we placed our real RTL when we got the pass-through working. The guys from Georgia Tech, however, interfaced their monitor module directly with PPC's PLB ports, so they couldn't use EDK's abstraction of the bus protocol through the PLB IPIF module. In fact, they had to synthesize their project in ISE since EDK wouldn't support what they were trying to do. That's why they had to use ChipScope to really see what the processor does. > As for 'learning' the PLB system, I found the IBM CoreConnect Bus > Functional Model (BFM) for the PLB, with the PLB doc, to be instrumental > in observing every kind of transaction I had to handle. I think the BFM > would be far easier than using ChipScope/Docs alone. The BFM allows the > generation of almost any kind of cycle-accurate PLB transaction a master > and slave can use. > > One other model I would like to begin using is the Xilinx provided > PPC405 swift model, which will allow the same code used by the real > processor to run on the simulation swift model simulation. 
This will > cause PLB transactions to occur in the same way they will on the real > system, i.e. cache line fills based on the PPC405 MMU's state, etc. > In designing our pass-through, we used the swift models. I definitely recommend learning how to use them. The swift models allow you to conduct full-system simulations. As for the BFM's, we weren't able to use them for our pcore since EDK 6.3i IPIF Create/Import wizard didn't support the use of Verilog modules (7.1 supports this now). We could have hacked this by using a netlist, but you cannot pass parameters/generics into a netlist, which is a feature we require for our pcore. I have used the BFM's for a VHDL module I worked on in the past and I agree that they too were helpful. NNArticle: 82558
"Marc Randolph" <mrand@my-deja.com> schrieb im Newsbeitrag news:1113444520.108760.133130@l41g2000cwc.googlegroups.com... > > Antti Lukats wrote: > > "Stephane" <stephane@nospam.fr> schrieb im Newsbeitrag > > news:d3jggu$kpo$1@ellebore.extra.cea.fr... > > > Antti Lukats wrote: > > > > "Stephane" <stephane@nospam.fr> schrieb im Newsbeitrag > > > > news:d3j43r$e32$1@ellebore.extra.cea.fr... > > > > > > > I don't agree with you: here are the 32 configuration data bits: > > > > > > PAD209 X27Y127 IOB_X1Y127 F14 1 IO_L1P_D31_LC_1 > > > > those are Local Clock, the SelectMAP is 8 bit wide !!!! > > Actually the OP is correct - that IS supposed to be a 32-bit SelectMAP > interface... the ug075.pdf pinout document discusses it briefly. I > don't blame everone for being confused about it though - Xilinx makes > just enough mention of it that you wonder if it might work, but when I > asked my trusty FAE about it a few months ago, he said it is not > supported at this time. wopla! I did see the paramter of bus width on the ICAP V4, but in ALL DOCs the selectmap is defined as 8 bit, that is on ALL DOCs except the pinouts docs! > Also, _LC pins are Low Capacitance pins (can't do LVDS output). Local > clock pins are called _CC (for Clock Capable). Global clocks are > thankfully _GC. ah I was looking at the list of pins that contained _LC and _CC mixture so I messed the two > > > >>so the minimum reconfiguration time for this part should be a > little bit > > > >>more than 7.4/100/32 = 2.3ms > > 7.4/100/8 = 9.25 ms, plus a little at the beginning and end. I'd > budget at least 10ms, maybe a few more. > > Have fun! > > Marc >Article: 82559
<praveen.kantharajapura@gmail.com> schrieb im Newsbeitrag news:1113452931.790055.285850@f14g2000cwb.googlegroups.com... > > Antti Lukats wrote: > > <praveen.kantharajapura@gmail.com> schrieb im Newsbeitrag > > news:47cf10b7.0504130430.9a34497@posting.google.com... > > > Hi all, > > > > > > This is a basic qustion regarding SDA and SCL pins. > > > Since both these pins are bidirectional these should pins need to > be > > > tristated , so that the slave can acknowledge on SDA. > > > > > > But i have seen in some docs that a '1' need to be converted to a > 'Z' > > > while driving on SDA and SCL, what is the reason behind this???? > > > > > > Thanks in advance, > > > Praveen > > > > well in order to drive '1' (EXTERNAL RESISTIVE PULLUP) you need to Z > the wire > > e.g. tristate it. > > 0 is driven as 0 > > 1 is driven (or released) as Z, ext pullup will pull the wire high > > In order to drive a '1' , i will not tristate it to 'Z' i will drive a > '1' only. > Any issues(Hardware malfunction) if i drive a'1' instead of 'Z' > > > > > Antti YES - see the other poster's reply. The master should drive 0 and Z at least on the SDA pin, and only in the case that there is no multimastering and no clock stretching is it OK to drive 0 and 1 on the SCL anttiArticle: 82560
Kelly Hall <khall@acm.org> wrote in message news:<QA07e.2350$dT4.172@newssvr13.news.prodigy.com>... > Delbert Cecchi wrote: > > > I was referring to the US Electronic Intelligence or something plane > > that got kidnapped out of international airspace near china and forced > > to land. Got the crew back in a while. As I recall we got the airframe > > back in boxes. It was rumored the crew didn't have enough time to > > destroy all. Probably within last 10 or so years. Google should turn > > it up. EC137 may have been the aircraft type. > > A Chinese F-8 and a US EP-3 collided during an intercept; the F-8 was > lost and the EP-3 performed an emergency landing at Hainan airfield. A > fairly standard cock-up between great powers. > > Kelly the theme for this episode of Jag: http://www.tvtome.com/tvtome/servlet/GuidePageServlet/showid-242/epid-99581/ though the ending is a bit different ;) -LasseArticle: 82561
One thing simulation isn't good at is creating random inputs.. for example... I've been working on a telephone port.. and the FPGA simulation is good.. but there are other chips, and they didn't always function as expected. This caused the real FPGA to lock up and do strange, unexpected things. Also, pins (accidentally) weren't locked down by the original designer, so some features were by accident rather than by design. The simulator also won't pick up metastability issues... I had that one bite me too. But a successful simulation is a milestone. I've taken a simulation to a working prototype PCB in less than a week.. Mind you .... I've spent the last 2 weeks fixing up "unexpected" glitches.. not to do with the FPGA.. but due to real world timings when the FPGA interacts with the outside world.. but the board did work exactly as expected. Simon "Ankit Raizada" <ankit.raizada@gmail.com> wrote in message news:1113393997.772874.97950@l41g2000cwc.googlegroups.com... > I am just wondering if i simulate a design given in verilog using a > test fixure in a modern simulator like ModelSim and the outputs are > verified, what are the chances that the design will still not work in > the actual FPGA assuming it fits and Place and Route is successful. > > What are the factors that make this difference and how can i catch them > in the design cycle. > > I am actually creating few designs for DSP algos for my acadmic > project, and being a beginnner in this whole DSP over FPGA I find it > rather difficult to decide wather to call a successful simulation a > milestone in the design cycle or not. > > Please share your experiences and ideas on this >Article: 82562
On Wed, 13 Apr 2005 13:00:07 -0700, Shalin Sheth wrote: Is this some sort of FAQ reply for people who want more speed from a MicroBlaze? I've never used a MicroBlaze or Xilinx (I use Nios II on Altera chips), but it looks like you almost completely missed the OP's point - he is not (yet !) interested in the quality of code generated by the compiler, but is suffering from 24-cycle memory reads on the SDRAM. This is most likely a problem with the SDRAM controller or its setup. Perhaps you are getting a full bank + row + column select for every access, although even then 24 cycles is way too long. I don't know what sort of tools Xilinx has, and how they compare to Altera's SOPC Builder, but when I had trouble with my SDRAM (it took 2 cycles per access instead of 1, during bursts), I tested with a simple setup of a Nios II running from internal FPGA memory, a DMA component (to easily generate burst sequences), and the SDRAM controller. Using the debugger, I manually set the DMA to burst read or write and used SignalTap (ChipScope on Xilinx?) to view what was happening. That way you are simplifying things as much as possible to concentrate on the specific problem. > Vladmir, > > Interesting data point. How much did his performance increase after > enabling caches? > > First, check to make sure that you have compiler optimization enabled. > This does make a hugh difference in optimizing your software code (2-3x > in some instances). I would suggest using the latest EDK 7.1 GNU > compiler here. > > Second, in EDK 7.1 a new MCH_OPB_SDRAM memory controller was released > that connects to the Xilinx CacheLink interface of MicroBlaze v4.0. > This also greatly improves performance when using caches. > > Finally, you may want to use tools like xil_profile to see where the > processor is spending a lot of its time. You may be able to improve the > performance by enabling hardware features such as multiplier, divider or > barrel shifter. 
> > Cheers, > Shalin- > > v_mirgorodsky@yahoo.com wrote: >> Hi, ALL! >> >> Recently one of my friends faced very strange problem. He had the >> MicroBlaze CPU in his design running with 50MHz clock speed. He also >> had external SDRAM module and his application was executing out of >> external SDRAM memory. During first few benchmark tests he realized >> that it takes about 24 clock cycles to access memory :( This means >> that cool embedded 50MHz MicroBlaze CPU runs slower than poor external >> 8MHz AVR. After my advice he enabled the cache within MicroBlaze, but >> application execution speed did not increased significantly. >> >> As he described later, this was one of hand-on samples from EDK. May be >> the sample is not optimized for performance and very simplified, but >> net performance of 2MHz processor is not even close to advertised by >> Xilinx :( >> >> Could any one give any comment on that? >> >> Regards, >> Vladimir S. Mirgorodsky >>Article: 82563
I'm quite new to FPGA/Verilog and I'm not sure if this is the correct news group to use for this kind of posting - apologies if I've posted to the wrong place. Anyway, I'm having problems using memory within an FSM. I'm currently using Xilinx ISE for a VirtexII. I'm trying to use the RAMB16_S18 memory primitive (SelectRAM). I've written a short test which writes the sequence 0, 1, 2, 3 .... 14, 15 to the RAM in one state, then in another it reads it back. However, I read back the memory as 15, 0, 1, 2 .... 13, 14. I'm assuming that the 15 in the first read-back element is from the previous cycle and hence the whole lot is offset due to a clocking issue. I've truncated the code and copied below:

// Buffer clock for the ram
wire Raw_Data_Profile_CLK;
BUFG Raw_Data_Profile_CLK_Buffer(.I(Main_Clock), .O(Raw_Data_Profile_CLK));

// Setup the RAM
reg  [9:0]  Data_Address;
reg  [15:0] Data_In;
reg  [2:0]  Data_In_Parity;
wire [15:0] Data_Out;
wire [2:0]  Data_Out_Parity;
reg         WE;

RAMB16_S18 Data_RAM (
    .DI(   Data_In),
    .DIP(  Data_In_Parity),
    .ADDR( Data_Address),
    .EN(   1'b1),
    .WE(   WE),
    .SSR(  1'b0),
    .CLK(  Main_Clock),
    .DO(   Data_Out),
    .DOP(  Data_Out_Parity)
);

// Update the Next State for the finite state machine
reg [1:0] Current_State, Next_State;
always @(posedge Main_Clock) begin
    Current_State <= Next_State;
end

// Implement the Finite State Machine
reg [4:0] Test_Counter;
always @(posedge Main_Clock) begin
    case (Current_State)
        Test_State_1: begin
            WE <= 1'b1;                       // Enable writes to the memory
            Data_Address <= Test_Counter;     // Select the address (ie 0, 1, 2, 3 etc)
            Data_In <= Test_Counter;          // Fill the mem location with 0, 1, 2, 3 etc
            Test_Counter <= Test_Counter + 1; // Increment the counter
            if (Test_Counter >= 15) begin
                Next_State <= Test_State_2;   // Goto next state
                Test_Counter <= 0;
            end
            else
                Next_State <= Test_State_1;   // Stay in this state
        end
        Test_State_2: begin
            WE <= 1'b0;                       // Enable reads from the memory
            Data_Address <= Test_Counter;     // Select the address (ie 0, 1, 2 etc)
            Output <= Data_Out;               // Retrieve the data from the memory
            Test_Counter <= Test_Counter + 1; // Increment the counter
            if (Test_Counter >= 15) begin
                Next_State <= Test_State_1;   // Goto next state
                Test_Counter <= 0;
            end
            else
                Next_State <= Test_State_2;   // Stay in this state
        end
    endcase
Article: 82564
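[The off-by-one read-back described above is the expected behaviour of block RAM: DO shows the data for the address that was presented on the previous clock edge, not the current one. One hedged way to line things up (a sketch with made-up names, not a drop-in fix for the post's code) is to delay the "read in flight" flag one cycle and capture DO only when it is actually valid:]

```verilog
// Sketch: compensate for the one-cycle synchronous read latency of
// block RAM by pipelining the read-enable alongside the RAM's own
// address register, then capturing DO a cycle after the read is issued.
module bram_read_align (
    input  wire        clk,
    input  wire        rd_en,     // read address presented this cycle
    input  wire [15:0] ram_do,    // DO port of the block RAM
    output reg  [15:0] data,
    output reg         data_valid
);
    reg rd_pending;
    always @(posedge clk) begin
        rd_pending <= rd_en;       // DO becomes valid after this edge
        data_valid <= rd_pending;
        if (rd_pending)
            data <= ram_do;        // word for *last* cycle's address
    end
endmodule
```

[In the state machine above, the equivalent fix is either to present the read address one state (cycle) earlier, or to associate Data_Out with a one-cycle-delayed copy of Test_Counter.]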
Hi all, can you please let me know the different tools used in industry for ASIC synthesis? Regards, kishoreArticle: 82565
Dear All,

I am trying to establish if I can fit the following functionality into a single FPGA.

I have FPGA resource utilisation statistics (from the ISE tool MAP process) for four functional blocks. I also have the statistics on the number of available resources on the XC2VP30 FPGA.

My question is: can I use the individual MAP reports to accurately estimate if my four separate functional blocks will fit in the XC2VP30 FPGA?

              XC2VP30   Block A   Block B   Block C   Block D    Total
Slices         13,696     4,248     2,771       848     5,370   13,237
Flip-Flops     27,392     5,056     3,406        95     4,888   13,445
4-Input LUTs   27,392     5,885     3,036     1,581     8,281   18,782

Since each of the four blocks was 'compiled' (MAP reports generated in the ISE tool) individually, no slice contains un-related logic.

The other FPGA resources (DCMs, GCLKs, PPCs etc ...) are all under-utilised.

My understanding is that since the 'Total' number of slices used is less than the number of slices available on the XC2VP30, no slice will have to contain un-related logic, so my four blocks (Block A - D) will fit inside my XC2VP30 FPGA.

Is this correct, or have I made some critical assumptions regarding combining functionality within the FPGA and regarding timing aspects?

Regards

SimonArticle: 82566
Hi Simon, if device total is 13,696 and A+B+C+D total is 13,237 then I am 99% positive that combining A+B+C+D in a single design ABCD will cause problems. Unless of course large parts of A, B, C or D are optimized away when combined. Hm, I take the 99% back, let's say I am 80% sure you will have _some_ sort of problems (possibly not related to # of slice resources used). The ABCD slice usage can be lower (that's why I reduced the 99% sure to 80% sure) than the A+B+C+D, as the slice utilization ratio may be better, but here it depends how well the design fits and how good the tools really are. The 'unrelated logic' is not what you think it is, I think. Unrelated means that the tools generated additional logic that was not in the original design, in order to achieve performance or routing or any other reason. So it's not directly bound to the slice utilization. I am sometimes wrong (usually not). Anyway, cases where I am wrong or my guess is totally wrong interest me, so please post some results of what happened with ABCD in a single design! antti http://gforge.openchip.org whuuuuuuups! I checked your domain name ;) well my advice (based on your numbers) is that you should take a larger FPGA (if A, B, C, D cannot be optimized to use at least 10% less resources) just to be prepared for an in-field design change that may cause the design to not fit any more. "stockton" <simon.stockton@baesystems.com> schrieb im Newsbeitrag news:dbcd481c.0504140211.42f75283@posting.google.com... > Dear All, > > I am trying to establish if I can fit the following functionality into > a single FPGA. > > I have FPGA resource utilisation statistics (from the ISE tool MAP > process) for four functional blocks. I also have the statistics on the > number of available resources on the XC2VP30 FPGA. > > My question is can I use the individual MAP reports to accuratly > estimate if my four seperate functional blocks will fit in the XC2VP30 > FPGA? 
> > XC2VP30 Block A Block B Block C Block D Total > > Slices 13,696 4,248 2,771 848 5,370 13,237 > Flip-Flops 27,392 5,056 3,406 95 4,888 13,445 > 4-Input LUTs 27,392 5,885 3,036 1,581 8,281 18,782 > > Since each of the four blocks were 'compiled' (MAP reports generated > in the ISE tool) individually no slice contains un-related logic. > > The other FPGA resources (DCMs, GCLKs, PPCs etc ... ) are all under > utilised. > > My understanding is that since the 'Total' number of slices used is > less than the number of slices available on the 'XC2VP30' no slice > will have to contain un-related logic so my four blocks (Block A - D) > will fit inside my XC2VP30 FPGA. > > Is this correct or have I made some critical assumptions regarding > combining functionality within the FPGA and regarding timing aspects? > > Regards > > SimonArticle: 82567
> simulator such as ModelSim SE, ModelSim PE, Synopsys VCS, or Cadence
> NC-Sim.
>

For ModelSim PE you need to buy the SWIFT models separately; with SE you don't need to.
Article: 82568
Hi,

The SDRAM controller doesn't need 24 clock cycles for a single access. It's more like 12 clock cycles. But it seems that both the instruction and data interfaces on MicroBlaze are connected to the same memory controller and that no internal memory is used. So for a load instruction to execute, it will require two 12-clock-cycle accesses. A store is done a few cycles faster, and an instruction that doesn't access memory should take 12 clock cycles.

Using LMB will reduce instruction fetches to 1 clock cycle and data accesses to 2 clock cycles. That is the same latency as for cache hits.

It seems unusual that the usage of caches doesn't improve the performance. To my knowledge it is always a big improvement compared to running from external memories, especially SDRAM or DDR. Fast SRAM will have much less latency.

In order to get cacheline burst access, the MCH_OPB_SDRAM controller should be used. It will do burst-line reads and writes both for instruction and data cache misses.

Göran Bilski

David wrote:
> On Wed, 13 Apr 2005 13:00:07 -0700, Shalin Sheth wrote:
>
> Is this some sort of FAQ reply for people who want more speed from a
> MicroBlaze? I've never used a MicroBlaze or Xilinx (I use Nios II on
> Altera chips), but it looks like you almost completely missed the OP's
> point - he is not (yet!) interested in the quality of code generated by
> the compiler, but is suffering from 24-cycle memory reads on the SDRAM.
> This is most likely a problem with the SDRAM controller or its setup.
> Perhaps you are getting a full bank + row + column select for every
> access, although even then 24 cycles is way too long. I don't know what
> sort of tools Xilinx has, and how they compare to Altera's SOPC Builder,
> but when I had trouble with my SDRAM (it took 2 cycles per access instead
> of 1, during bursts), I tested with a simple setup of a Nios II running
> from internal FPGA memory, a DMA component (to easily generate burst
> sequences), and the SDRAM controller.
> Using the debugger, I manually set the DMA to burst read or write and
> used SignalTap (ChipScope on Xilinx?) to view what was happening. That
> way you are simplifying things as much as possible to concentrate on
> the specific problem.
>
>> Vladimir,
>>
>> Interesting data point. How much did his performance increase after
>> enabling caches?
>>
>> First, check to make sure that you have compiler optimization enabled.
>> This does make a huge difference in optimizing your software code (2-3x
>> in some instances). I would suggest using the latest EDK 7.1 GNU
>> compiler here.
>>
>> Second, in EDK 7.1 a new MCH_OPB_SDRAM memory controller was released
>> that connects to the Xilinx CacheLink interface of MicroBlaze v4.0.
>> This also greatly improves performance when using caches.
>>
>> Finally, you may want to use tools like xil_profile to see where the
>> processor is spending a lot of its time. You may be able to improve the
>> performance by enabling hardware features such as multiplier, divider or
>> barrel shifter.
>>
>> Cheers,
>> Shalin-
>>
>> v_mirgorodsky@yahoo.com wrote:
>>> Hi, ALL!
>>>
>>> Recently one of my friends faced a very strange problem. He had a
>>> MicroBlaze CPU in his design running with a 50MHz clock. He also
>>> had an external SDRAM module and his application was executing out of
>>> external SDRAM memory. During the first few benchmark tests he realized
>>> that it takes about 24 clock cycles to access memory :( This means
>>> that the cool embedded 50MHz MicroBlaze CPU runs slower than a poor
>>> external 8MHz AVR. On my advice he enabled the cache within MicroBlaze,
>>> but application execution speed did not increase significantly.
>>>
>>> As he described later, this was one of the hands-on samples from EDK.
>>> Maybe the sample is not optimized for performance and very simplified,
>>> but the net performance of a 2MHz processor is not even close to that
>>> advertised by Xilinx :(
>>>
>>> Could any one give any comment on that?
>>>
>>> Regards,
>>> Vladimir S.
>>> Mirgorodsky

Article: 82569
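Göran's cycle counts can be turned into a rough back-of-the-envelope model. This is only a sketch: the 12-cycle SDRAM figure and the 1/2-cycle LMB figures are the numbers quoted in the thread, while the 30% load/store mix is an assumed, purely illustrative value.

```python
# Crude cycles-per-instruction estimate from the latencies quoted in the
# thread. LOAD_STORE_FRACTION is a hypothetical instruction mix, not a
# measured figure.
SDRAM_ACCESS = 12            # cycles per uncached SDRAM access (from the post)
LMB_FETCH = 1                # LMB instruction-fetch latency (from the post)
LMB_DATA = 2                 # LMB data-access latency, same as a cache hit
LOAD_STORE_FRACTION = 0.30   # assumed fraction of loads/stores

def avg_cycles_per_instr(fetch_cycles, data_cycles):
    # Every instruction pays one fetch; loads/stores also pay a data access.
    return fetch_cycles + LOAD_STORE_FRACTION * data_cycles

sdram_cpi = avg_cycles_per_instr(SDRAM_ACCESS, SDRAM_ACCESS)
lmb_cpi = avg_cycles_per_instr(LMB_FETCH, LMB_DATA)
print(f"uncached SDRAM: ~{sdram_cpi:.1f} cycles/instruction")
print(f"LMB / cache hits: ~{lmb_cpi:.1f} cycles/instruction")
print(f"estimated speedup: ~{sdram_cpi / lmb_cpi:.1f}x")
```

Under these assumptions the uncached-SDRAM case costs nearly ten times as many cycles per instruction, which is consistent with Vladimir's observation that a 50MHz MicroBlaze running uncached from SDRAM can feel slower than a small 8MHz microcontroller.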
lecroy7200@chek.com wrote:
>>>> I'm sure others have this problem ...
>>>>
>>>> Is there a tool that'll let one view and hopefully print a schematic
>>>> done in the old Xilinx F2.1i schematic tool? The new stuff doesn't
>>>> want to know about the old stuff, and worse is that you can't even
>>>> install 2.1i on an XP machine. (Yeah, that'll teach me to upgrade.)
>>>> I don't want to do anything with this schematic other than view it.
>>>> I'm doing a new board sorta based on an old design, and the new design
>>>> will of course be in VHDL rather than as a schematic.
>>>
>>> Aldec's tool Active-HDL has the capability of importing Foundation
>>> schematics and entire projects. The import utility not only allows
>>> printing, but also importing these files into their format, maintaining
>>> them and even converting them into an HDL design that can be targeted
>>> for any family/device. They show this capability on their website:
>>> http://downloads.aldec.com/Previews/Presentations/IP_Core.html
>>
>> Oops sorry, wrong link:
>> http://downloads.aldec.com/Previews/Presentations/Active-XE_Edition.html
>
> I have this same problem. The new Foundation can import back to
> version 4. Because of a lawsuit between Aldec and Xilinx, they cannot
> ship older versions of the Foundation tools. The newest Aldec tools
> can't seem to import the 2.1 project. However, I was able to import
> version 2.1 into version 3.1 and then read the whole project with
> the new Aldec tools. What a pain.

They have a utility in the Tools menu that helps convert the old pre-2.5 schematics to a format that can be imported. As I remember, it also allows editing the schematics in .sch format before import.
Article: 82570
Reinier wrote:
> Hi,
>
> I'm looking for a freeware or low-cost program to document and
> illustrate the signal processing flow in my FPGA design. I'd like to
> use building blocks like adders, multipliers, memory, busses etc. What
> do you guys use to make some nice looking pictures? I don't want to
> spend days learning Corel Draw or something huge like that.
>
> Thanks,
> Reinier

Never used it, but I have heard of Dia: http://www.gnome.org/projects/dia/ It is claimed to be a Visio replacement. It's multi-platform and free. Give it a try.

EG
Article: 82571
Thanks for your replies. I was never considering one regulator per RocketIO, but having never built anything that uses RocketIO before, I'm keen to know what is considered best practice. I was just asking for opinions on the need (or otherwise) for a RocketIO regulator and a second "everything else on 2.5V" regulator. Looking at the recent replies, there seems to be some confusion. However, as the UG seems to imply separate 2.5V regulators, maybe that's the way I should play it.

Thanks,
Roger

"jason.stubbs" <jason.stubbs@gmail.com> wrote in message news:1113425021.561172.85090@z14g2000cwz.googlegroups.com...

Extract from "The RocketIO Transceiver User Guide UG024 (v2.5) December 9, 2004":

"PCB Design Requirements (Page 109)
To operate properly, the RocketIO transceiver requires a certain level of noise isolation from surrounding noise sources. For this reason, it is required that both dedicated voltage regulators and passive high-frequency filtering be used to power the RocketIO circuitry."

If you don't use the RIOs you still have to supply power, but you can use the VCCAUX supply in this case.

Hope this helps clarify the situation

Jason
Article: 82572
Antti Lukats wrote:
> "Marc Randolph" <mrand@my-deja.com> schrieb im Newsbeitrag
> news:1113444520.108760.133130@l41g2000cwc.googlegroups.com...
>
>> Antti Lukats wrote:
>>
>>> "Stephane" <stephane@nospam.fr> schrieb im Newsbeitrag
>>> news:d3jggu$kpo$1@ellebore.extra.cea.fr...
>>>
>>>> Antti Lukats wrote:
>>>>
>>>>> "Stephane" <stephane@nospam.fr> schrieb im Newsbeitrag
>>>>> news:d3j43r$e32$1@ellebore.extra.cea.fr...
>>>>
>>>> I don't agree with you: here are the 32 configuration data bits:
>>>>
>>>> PAD209 X27Y127 IOB_X1Y127 F14 1 IO_L1P_D31_LC_1
>>>
>>> those are Local Clock, the SelectMAP is 8 bits wide!!!!
>>
>> Actually the OP is correct - that IS supposed to be a 32-bit SelectMAP
>> interface... the ug075.pdf pinout document discusses it briefly. I
>> don't blame everyone for being confused about it though - Xilinx makes
>> just enough mention of it that you wonder if it might work, but when I
>> asked my trusty FAE about it a few months ago, he said it is not
>> supported at this time.
>
> wopla! I did see the parameter for bus width on the ICAP V4, but in ALL
> docs the SelectMAP is defined as 8 bits - that is, in ALL docs except
> the pinout docs!

Thank you guys for your feedback! I take it that the 32-bit SelectMAP is reserved for future use, for when 7.1i is stable and Xilinx engineers more available... OK, but how can the internal configuration logic detect what kind of bitstream is incoming? As soon as it sees the sync words? In that case, one cannot place just any garbage on D[31..8], as it might be badly interpreted!
Actually, I was puzzled by this recent Xilinx answer:

7.1i ECS - Bus width of pins I and O is incorrect in symbol ICAP_VIRTEX4

Family: Software
Product Line: FPGA Implementation
Part: ECS
Version:
Record Number: 20920
Last Modified: 03/23/05 08:27:54
Status: Active

Problem Description:
Keywords: input, output, icap, 32, 8
Urgency: Standard

General Description: In the Xilinx Schematic Editor, the ICAP_VIRTEX4 symbol has an I and O pin with a bus width of 8. The width should be 32.

Solution 1: This problem has been fixed in the latest 7.1i Service Pack available at: http://support.xilinx.com/xlnx/xil_sw_updates_home.jsp
The first service pack containing the fix is 7.1i Service Pack 1.
Article: 82573
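On the sync-word question raised above: the configuration logic ignores everything presented on the SelectMAP port until it has seen the 32-bit synchronisation word (0xAA995566 on Virtex-family devices), and only then starts interpreting configuration packets. A toy software sketch of that search follows; it is illustrative only - this is not how the silicon implements it, and it says nothing about how the width of the port would be detected.

```python
# Virtex-family bitstream synchronisation word, most significant byte first.
SYNC_WORD = bytes([0xAA, 0x99, 0x55, 0x66])

def find_sync(stream: bytes) -> int:
    """Return the byte offset just past the sync word, or -1 if absent."""
    idx = stream.find(SYNC_WORD)
    return -1 if idx < 0 else idx + len(SYNC_WORD)

# Padding before the sync word is skipped, as on the real configuration port.
bitstream = bytes([0xFF] * 4) + SYNC_WORD + bytes([0x30, 0x00, 0x80, 0x01])
print(find_sync(bitstream))  # prints 8: packets start after the sync word
```

This is why arbitrary data before synchronisation is harmless, but once the sync word has been seen, every subsequent word is interpreted as part of a packet, which is exactly Stéphane's concern about garbage on D[31..8].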
As they state in their paper http://cag.csail.mit.edu/warfp2005/submissions/29-suh.pdf "In our initial study, we deploy a monitoring capsule in Dcaches to monitor the memory behavior between a CPU and L1 Dcache."

It is not possible to monitor signals between the CPU and L1 cache (I or D). Was the monitoring of CPU/L1 inferred from the cache misses seen coming from L1? Even so, a lot of memory behavior is missed when only observing cache misses.

Regards,
Tony

Nju Njoroge wrote:
> Anthony Mahar wrote:
>
>> Nju Njoroge wrote:
>>
>> Interesting question for the "Monitoring Capsule Design" paper... they
>> state they monitor behavior "between the CPU and L1 Dcache." Did they
>> explain how they were able to do this, since the PPC405 and L1 are part
>> of the same hard core?
>
> You are right--the CPU and the L1 cache are in the same hard core, so
> we don't have access to the interface between the CPU core and the
> cache. As I described in my previous post, they placed their monitor at
> the interface of the L1 cache port that is usually connected to the
> PLB. Thus, instead of connecting their CPU to the PLB bus, they
> connected the PPC core to their monitor, which is then connected to the
> PLB.
>
> NN
Article: 82574
Roger,

The way I understood it, and therefore implemented it, was to use a single linear regulator (LT1963) to power all of the RIO circuitry on the FPGA. If the LR is capable of supplying more than one FPGA's RIO circuitry, then that is acceptable. As long as all of the RIO supply pins are individually filtered with ferrite beads (and caps where they are not embedded), this should work. Under no circumstances should a switching regulator be used to power the RIO. Also, do not use the same LR that powers the RIO to power the internal logic or IO of the FPGA. As you said in your earlier post: a linear regulator for RIO, and a separate regulator for everything else.

Regards
Jason