Did you look at these papers, which give some comparison of the power consumption of FPGAs:
http://www.sigda.org/Archives/ProceedingArchives/Iccad/Iccad2002/03c_1.pdf
http://bwrc.eecs.berkeley.edu/People/Faculty/jan/publications/p113.pdf

Sumit
Article: 74351
> > General Schvantzkoph <schvantzkoph@yahoo.com> wrote in message news:<pan.2004.10.01.14.16.37.467412@yahoo.com>...
> >> Those figures are for pure ASICs, what are the costs for structured ASICs?

Here is a viewpoint from Faraday, slightly biased, but the numbers are in the right ballpark: http://www.faraday-tech.com/StructuredASIC/benefit.html

The number of engineers, tool costs etc. are incorrect. For the same complexity of design, you typically require the same number of engineers and tools -- the only difference is that an ASIC requires back-end tools (which are very expensive) and the engineers to use them. However, for large, complex designs, even FPGAs and structured ASICs can be a pain when it comes to achieving timing/performance.

Here is a whole set of articles on choosing the right silicon: http://www.eetimes.com/industrychallenges/ and another series of articles on FPGAs versus ASICs: http://www.eedesign.com/silicon/showArticle.jhtml?articleId=47204604&kc=6325

Sorry for inundating everyone with pointers, but the articles at these links are useful for this discussion.

Sumit
Article: 74352
"Brad Smallridge" <bradsmallridge@dslextreme.com> wrote in message news:10mdgekrsrh1c61@corp.supernews.com... > Well, I drive the SRAM clock with a synthesized higher frequency. Shouldn't > the timing from the output registers of the FPGA, going to the address lines > and control lines of the SRAM, be constrained relative to the higher clock > frequency that is clocking these registers? If you're using IOB registers, the timing would be contained defacto. The timing still ends up needing to be relative to something. The SRAM clock comes from a synthesized higher frequency. Do you have a common reference between the SRAM clock synthesizer and the DCM? Your timing budget ends up being relative to that common reference. If you're using the DCM in frequency synthesis mode, the relationship between the clocks becomes less obvious, particularly to the SRAM. The clock to the SRAM and the data to/from the FPGA either are related or your data and SRAM clock will be random relative to each other.Article: 74353
Hi all,

I am trying to implement a DSP algorithm in an FPGA. My algorithm has sine and cosine functions in it. Can somebody comment on implementing sine and cosine functions in VHDL or in MATLAB fixed point (using the Fixed-Point Toolbox)?

Thanks,
SD
Article: 74354
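The sine/cosine question above is a common one; for illustration only (nothing below comes from the thread, and the entity name, phase width and output format are my own assumptions), one frequently used approach is a sine lookup table, filled at elaboration time and read synchronously so the tool can infer a block RAM:

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use ieee.math_real.all;   -- only used at elaboration time to fill the table

entity sine_lut is
  port (
    clk   : in  std_logic;
    phase : in  unsigned(7 downto 0);   -- 256 samples per full cycle
    sine  : out signed(15 downto 0)     -- Q1.15 fixed point
  );
end sine_lut;

architecture rtl of sine_lut is
  type rom_t is array (0 to 255) of signed(15 downto 0);

  function init_rom return rom_t is
    variable rom : rom_t;
  begin
    for i in 0 to 255 loop
      rom(i) := to_signed(integer(round(32767.0 *
                sin(2.0 * MATH_PI * real(i) / 256.0))), 16);
    end loop;
    return rom;
  end function;

  constant ROM : rom_t := init_rom;
begin
  process(clk)
  begin
    if rising_edge(clk) then
      sine <= ROM(to_integer(phase));   -- synchronous read helps block-RAM inference
    end if;
  end process;
end rtl;

Cosine is the same table read with a quarter-period phase offset (phase + 64 here). If the synthesis tool will not evaluate math_real at elaboration time, the table can be generated offline and pasted in as a constant; a quarter-wave table with sign folding, or a CORDIC core, are the usual alternatives when block RAM or precision is tight.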
The fact that the clocks are in phase isn't the issue. The issue is that the slow domain doesn't "know" where the fast domain is in its cycle. Suppose I send out a read command on fast clk 1, the data comes back on fast clk 3, and should be transferred to the slow domain on fast clk 4. And suppose this cycle repeats every 3 fast clks. How do I get the slow clock to line up with fast clk 4, 7, 10, etc.?

The idea of just using one fast clk, with clock enables on all the slow stuff, is an attractive idea. However, I don't have any experience in determining whether this will lead to timing constraint issues or not. It seems that it will be difficult enough to get the performance from the SRAM interface. How do I loosen the constraints on the slow stuff? All those registers would be on the same clock net.

Ray Andraka suggested making a copy of the slow clock in the fast domain, and using that signal as a clock enable, I guess in the fast domain? I'm not sure how this helps.

The idea of using 90/270 clocks to get rid of skew issues seems good. I have found XAPP253, which is a DDR SDRAM controller with some of the same issues. Trying to work through that.
Article: 74355
H. Peter Anvin wrote:

(someone wrote)
>>> Doing a sequential algorithm in an FPGA is bound
>>> to be much slower than one on a hardwired CPU.
(snip)
> Think about it... what would make it take only one cycle on an FPGA,
> if it takes N cycles on a CPU? That would typically be parallelism
> of some sort.
> Computing a CRC can actually be done very fast on a CPU by using
> tables. I doubt you'd get a win there.

In the announcement for the ERSA conference (Engineering Reconfigurable Systems and Architectures) they say:

The advances in reconfigurable computing architecture, in algorithm implementation methods, and in automatic mapping methods of algorithms into hardware and processor spaces form together a new paradigm of computing and programming that has often been called `Computing in Space and Time'.

On a CPU things happen more or less sequentially. Algorithms are coded fairly densely in terms of transistors used per operation; an add instruction might take 32 bits. In an FPGA, an add instruction is implemented as a 32-bit adder. Efficient algorithms on an FPGA should do many things each clock cycle, and tend to be very different from the algorithms that are efficient on CPUs.

As for CRC, the table lookup works very well on CPUs, but there are some very different algorithms that work well for FPGAs, especially for larger word sizes.

-- glen
Article: 74356
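To illustrate glen's point above about doing many things per clock cycle (this sketch is mine, not glen's; the entity and port names are assumptions): instead of a byte-wide lookup table, an FPGA implementation can simply unroll the serial CRC LFSR so that a whole byte - or a much wider word - is folded into the CRC in a single cycle of combinational logic.

library ieee;
use ieee.std_logic_1164.all;

entity crc32_byte is
  port (
    clk     : in  std_logic;
    reset   : in  std_logic;                    -- synchronous, preloads all-ones
    din_vld : in  std_logic;
    din     : in  std_logic_vector(7 downto 0);
    crc     : out std_logic_vector(31 downto 0)
  );
end crc32_byte;

architecture rtl of crc32_byte is
  signal crc_r : std_logic_vector(31 downto 0) := (others => '1');

  -- unroll the serial CRC-32 LFSR (polynomial 0x04C11DB7) eight times;
  -- all eight bit-updates collapse into one layer of XOR logic per clock
  function crc32_update(c_in : std_logic_vector(31 downto 0);
                        d    : std_logic_vector(7 downto 0))
    return std_logic_vector is
    variable c  : std_logic_vector(31 downto 0) := c_in;
    variable fb : std_logic;
  begin
    for i in 7 downto 0 loop                    -- MSB of the byte first
      fb := c(31) xor d(i);
      c  := c(30 downto 0) & '0';
      if fb = '1' then
        c := c xor x"04C11DB7";
      end if;
    end loop;
    return c;
  end function;
begin
  crc <= crc_r;

  process(clk)
  begin
    if rising_edge(clk) then
      if reset = '1' then
        crc_r <= (others => '1');
      elsif din_vld = '1' then
        crc_r <= crc32_update(crc_r, din);
      end if;
    end if;
  end process;
end rtl;

Widening din to 32 or 64 bits only deepens the XOR tree; the same loop still synthesises to a single cycle of logic, which is where the FPGA win over the CPU table method comes from.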
YES, YES, YES, as someone once said in a movie! Brad, you need to put something like this in your UCF file:-

NET "fast_clock" period = 10ns;
NET "slow_enable" TNM=FFS "slow_flipflops";
TIMESPEC TS1000 = FROM : slow_flipflops : TO : slow_flipflops : 30ns;

So, in your VHDL or whatever, you've used a net 'slow_enable' to clock-enable the FFs that go at a third the speed of 'fast_clock'; I've assumed 'fast_clock' is 100MHz for this example. That's the first line of UCF stuff. The second line associates all the FFs that connect to the destinations of the net 'slow_enable' with the timing group name "slow_flipflops". The third line says that signals between members of timing group "slow_flipflops" have 30ns to get to their destination. Note that the FF that generates net 'slow_enable' is NOT included in this group, so the PAR knows to route this net to meet the 10ns requirement of 'fast_clock'.

Gotchas include the tools making copies of the net "slow_enable" if it has a large fanout. If this happens, you need to add extra lines like:-

NET "slow_enable_1" TNM=FFS "slow_flipflops";
NET "slow_enable_2" TNM=FFS "slow_flipflops";

You can get the net names from the EDIF file, the floorplanner, wherever. A little time and effort up front will save you from a world of hurt later!

Good luck mate,
Syms.

"Brad Smallridge" <bradsmallridge@dslextreme.com> wrote in message news:10mdj2smab7t487@corp.supernews.com...
> The idea of just using one fast clk, with clock enables on all the slow
> stuff, is an attractive idea. However I don't have any experience in
> determining whether this will lead to timing constraint issues or not.
> It seems that it will be difficult enough to get the performance from the
> SRAM interface. How do I loosen the constraints on the slow stuff? All
> those registers would be on the same clock net.
>
Article: 74357
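On the HDL side, the 'slow_enable' net in Symon's constraints above might be generated like this (a minimal sketch of my own, assuming a divide-by-three and the names fast_clock/slow_enable from the example; it is not Symon's code):

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity slow_enable_gen is
  port (
    fast_clock  : in  std_logic;
    slow_enable : out std_logic   -- high for one fast_clock period in every three
  );
end slow_enable_gen;

architecture rtl of slow_enable_gen is
  signal div_cnt : unsigned(1 downto 0) := (others => '0');
begin
  process(fast_clock)
  begin
    if rising_edge(fast_clock) then
      if div_cnt = 2 then
        div_cnt     <= (others => '0');
        slow_enable <= '1';
      else
        div_cnt     <= div_cnt + 1;
        slow_enable <= '0';
      end if;
    end if;
  end process;
end rtl;

Every 'slow' register is still clocked by fast_clock and merely gated by slow_enable, so there is no clock-domain crossing at all; the TIMESPEC above simply relaxes the paths between enabled registers to three fast-clock periods.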
Hi, I am trying to find out if there are any restrictions on using both of the PPC cores in the Virtex-II Pro device as well as two XAUI cores (one on top and one on the bottom). Specifically, I am trying to find out the clocking restrictions and any restrictions on using both PPC cores and both XAUI cores in a single device.

thanks
Huzaifa
Article: 74358
> If you're using IOB registers, the timing would be contained de facto.

That is another issue I am working on. The tristate DQs have been properly assigned to the IOBs, probably since they can only really live there. For some reason, however, the XST tool wants to put the address register in slices rather than IOBs. I am a newbie to this stuff and haven't been able to figure out how to get them into the IOBs. One way, I thought, would be to constrain the clock-to-pad timing, but it doesn't seem to work.

> The timing still ends up needing to be relative to something. The SRAM
> clock comes from a synthesized higher frequency. Do you have a common
> reference between the SRAM clock synthesizer and the DCM?

I'm not sure what you are saying. I am driving the SRAM clock from the DCM through an output pad.

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity v6 is
  port(
    oscin   : in    std_logic;
    sramclk : out   std_logic;
    coreclk : inout std_logic;
    srama   : out   std_logic_vector(20 downto 0); -- address
    sramdqa : inout std_logic_vector(9 downto 1);  -- data a
    sramdqb : inout std_logic_vector(9 downto 1);  -- data b
    srame1  : out   std_logic;                     -- chip enable 1
    sramba  : out   std_logic;                     -- a data write enable
    srambb  : out   std_logic;                     -- b data write enable
    sramw   : out   std_logic;                     -- write enable
    tpaup   : in    std_logic;                     -- switch
    tpadown : in    std_logic;                     -- switch
    fx2fd   : out   std_logic_vector(15 downto 0); -- here used as a test
    bnc     : out   std_logic );
end v6;

architecture v6behave of v6 is

  COMPONENT sramdcm
    PORT(
      RST_IN       : IN  std_logic;
      CLKIN_IN     : IN  std_logic;
      LOCKED_OUT   : OUT std_logic;
      CLKFX_OUT    : OUT std_logic;
      CLKFX180_OUT : OUT std_logic;
      CLK0_OUT     : OUT std_logic);
  END COMPONENT;

  type sram_state_type is (
    sram_state_write,
    sram_state_read,
    sram_state_reset,
    sram_state_idle );

  signal sram_state   : sram_state_type;
  signal sram_state_1 : sram_state_type;
  signal sram_state_2 : sram_state_type;
  signal sram_state_3 : sram_state_type;
  signal sram_state_4 : sram_state_type;
  signal sram_state_5 : sram_state_type;

  signal sram_addr_delay_1 : std_logic_vector(20 downto 0);
  signal sram_addr_delay_2 : std_logic_vector(20 downto 0);
  signal sram_addr_delay_3 : std_logic_vector(20 downto 0);
  signal sram_addr_delay_4 : std_logic_vector(20 downto 0);

  signal sram_w_1 : std_logic; -- mirrors sramw output pin
  signal sram_w_2 : std_logic; -- mirrors sramw output pin
  signal sram_w_3 : std_logic; -- mirrors sramw output pin
  signal sram_w_4 : std_logic; -- mirrors sramw output pin

  signal sram_input_a : std_logic_vector(9 downto 1);
  signal sram_input_b : std_logic_vector(9 downto 1);

  signal sram_dqa_2 : std_logic_vector(9 downto 1);
  signal sram_dqa_1 : std_logic_vector(9 downto 1);
  signal sram_dqa_0 : std_logic_vector(9 downto 1);
  signal sram_dqb_2 : std_logic_vector(9 downto 1);
  signal sram_dqb_1 : std_logic_vector(9 downto 1);
  signal sram_dqb_0 : std_logic_vector(9 downto 1);

  signal sramclk180 : std_logic;

  signal tpaselect_1 : std_logic;
  signal tpaup_1     : std_logic;
  signal tpadown_1   : std_logic;

  signal vga_col   : std_logic_vector(9 downto 0);
  signal vga_row   : std_logic_vector(9 downto 0);
  signal image_col : std_logic_vector(9 downto 0);
  signal image_row : std_logic_vector(9 downto 0);

  signal test_a123 : std_logic;
  signal test_a456 : std_logic;
  signal test_a789 : std_logic;
  signal test_b123 : std_logic;
  signal test_b456 : std_logic;
  signal test_b789 : std_logic;
  signal test_a    : std_logic;
  signal test_b    : std_logic;

begin

  -- sram dcm set for frequency synthesis
  Inst_sramdcm: sramdcm
    PORT MAP(
      RST_IN       => '0',
      CLKIN_IN     => OscIn,
      LOCKED_OUT   => open,
      CLKFX_OUT    => sramclk,
      CLKFX180_OUT => sramclk180,
      CLK0_OUT     => coreclk );

  sram_states_process:process(sramclk180)
  begin
    if(sramclk180'event and sramclk180='1') then
      case sram_state is

        when sram_state_read =>
          -- branch from here to all other states
          if( tpadown_1 = '0' ) then
            sram_state <= sram_state_reset;
          elsif( tpaup_1 = '0' ) then
            sram_state <= sram_state_write;
          else
            sram_state <= sram_state_idle;
          end if;
          sramw    <= '1';
          sram_w_1 <= '1';
          sram_addr_delay_1( 9 downto  0) <= vga_col;
          sram_addr_delay_1(19 downto 10) <= vga_row;
          srama( 9 downto  0) <= vga_col;
          srama(19 downto 10) <= vga_row;
          vga_col <= vga_col+1;
          if( vga_col = "1111111111" ) then
            vga_row <= vga_row+1;
          end if;
          -- vga_col( 9 downto 1) <= vga_col( 8 downto 0 );
          -- vga_col(0) <= vga_col(9) xnor vga_col(6);
          -- if( vga_col = "0000000000" ) then
          --   vga_row( 9 downto 1) <= vga_row( 8 downto 0 );
          --   vga_row(0) <= vga_row(9) xnor vga_row(6);
          -- end if;

        when sram_state_write =>
          sram_state <= sram_state_read;
          sramw    <= '0';
          sram_w_1 <= '0';
          ------------------------------------------------------------
          -- match these two lines with those in the sram_test:process
          sram_dqa_0 <= image_col( 9 downto 1 );
          sram_dqb_0 <= image_row( 9 downto 1 );
          ------------------------------------------------------------
          srama( 9 downto  0) <= image_col;
          srama(19 downto 10) <= image_row;
          image_col <= image_col+1;
          if( image_col = "1111111111" ) then
            image_row <= image_row+1;
          end if;
          -- image_col( 9 downto 1) <= image_col( 8 downto 0 );
          -- image_col(0) <= image_col(9) xnor image_col(6);
          -- if( image_col = "0000000000" ) then
          --   image_row( 9 downto 1) <= image_row( 8 downto 0 );
          --   image_row(0) <= image_row(9) xnor image_row(6);
          -- end if;

        when sram_state_reset =>
          sram_state <= sram_state_read;
          sramw    <= '0';
          sram_w_1 <= '0';
          sram_dqa_0 <= "000000000";
          sram_dqb_0 <= "000000000";
          srama( 9 downto  0) <= image_col;
          srama(19 downto 10) <= image_row;
          image_col <= image_col+1;
          if( image_col = "1111111111" ) then
            image_row <= image_row+1;
          end if;
          -- image_col( 9 downto 1) <= image_col( 8 downto 0 );
          -- image_col(0) <= image_col(9) xnor image_col(6);
          -- if( image_col = "0000000000" ) then
          --   image_row( 9 downto 1) <= image_row( 8 downto 0 );
          --   image_row(0) <= image_row(9) xnor image_row(6);
          -- end if;

        when sram_state_idle =>
          sram_state <= sram_state_read;
          sramw    <= '1';
          sram_w_1 <= '1';
          sram_dqa_0 <= "000000000"; -- "111111111" better than "000000000" at 120MHz?
          sram_dqb_0 <= "000000000"; -- "000000000" better than "111111111", why?
          -- adding this srama line seems to mess up the test, why?
          -- putting this line reduces speed by about 5MHz
          -- srama <= "000000000000000000000";
          -- putting these two lines reduced speed by 11MHz
          -- image_col <= "1111111111";
          -- image_row <= "1111111111";

        when others =>
          sram_state <= sram_state_read;

      end case;
    end if;
  end process;

  srama(20) <= '0';

  -- SRAMW             WRITE   READ    WRITE   READ    WRITE   READ
  -- sram_w_1 ......... 0       1       0       1       0       1
  -- sram_w_2 ......... ....... 0       1       0       1       0
  -- sram_w_3 ......... ....... ....... 0       1       0       1
  -- SRAMA ............ RANDOM1 LINEAR1 RANDOM2 LINEAR2 RANDOM3 LINEAR3
  -- sram_dqx_0 ....... RANDOM1 .......
  -- sram_dqx_1 ....... ....... RANDOM1 .......
  -- sram_dqx_2 ....... ....... ....... RANDOM1
  -- DQA_OUT .......... ....... RANDOM1
  -- DQA_IN ........... ....... ....... ....... LINEAR1
  -- sram_input_a ..... ....... ....... ....... ....... LINEAR1
  -- sram_addr_delay_1  ....... LINEAR1
  -- sram_addr_delay_2  ....... ....... LINEAR1
  -- sram_addr_delay_3  ....... ....... ....... LINEAR1
  -- sram_addr_delay_4  ....... ....... ....... ....... LINEAR1
  -- bnc .............. ....... ....... ....... ....... ....... RESULT

  sram_w_delay:process (sramclk180)
  begin
    if(sramclk180'event and sramclk180='1') then
      sram_w_2 <= sram_w_1;
      sram_w_3 <= sram_w_2;
      sram_w_4 <= sram_w_3;
    end if;
  end process;

  sram_state_delay:process (sramclk180)
  begin
    if(sramclk180'event and sramclk180='1') then
      sram_state_1 <= sram_state;
      sram_state_2 <= sram_state_1;
      sram_state_3 <= sram_state_2;
      sram_state_4 <= sram_state_3;
      sram_state_5 <= sram_state_4;
    end if;
  end process;

  sram_read_data_pins_process:process(sramclk180)
  begin
    if(sramclk180'event and sramclk180='1') then
      if(sram_state_2 = sram_state_read ) then
        sram_input_a <= sramdqa;
        sram_input_b <= sramdqb;
      end if;
    end if;
  end process;

  sram_outgoing_data_delay:process (sramclk180)
  begin
    if(sramclk180'event and sramclk180='1') then
      sram_dqa_1 <= sram_dqa_0;
      sram_dqa_2 <= sram_dqa_1;
      sram_dqb_1 <= sram_dqb_0;
      sram_dqb_2 <= sram_dqb_1;
    end if;
  end process;

  sramdqa <= sram_dqa_2 when (sram_w_3 = '0') else (others=>'Z');
  -- when ( (sram_state_2 = sram_state_write)
  --     or (sram_state_2 = sram_state_reset) )
  sramdqb <= sram_dqb_2 when (sram_w_3 = '0') else (others=>'Z');
  -- when ( (sram_state_2 = sram_state_write)
  --     or (sram_state_2 = sram_state_reset) )

  srame1 <= '0';
  sramba <= '0';
  srambb <= '0';

  sram_addr_delay:process (sramclk180)
  begin
    if(sramclk180'event and sramclk180='1') then
      sram_addr_delay_2 <= sram_addr_delay_1;
      sram_addr_delay_3 <= sram_addr_delay_2;
      sram_addr_delay_4 <= sram_addr_delay_3;
    end if;
  end process;

  sram_test:process (sramclk180)
  begin
    if(sramclk180'event and sramclk180='1') then
      if( sram_state_3 = sram_state_read ) then
        test_a123 <= (sram_input_a(1) xnor sram_addr_delay_4(1)) and
                     (sram_input_a(2) xnor sram_addr_delay_4(2)) and
                     (sram_input_a(3) xnor sram_addr_delay_4(3));
        test_a456 <= (sram_input_a(4) xnor sram_addr_delay_4(4)) and
                     (sram_input_a(5) xnor sram_addr_delay_4(5)) and
                     (sram_input_a(6) xnor sram_addr_delay_4(6));
        test_a789 <= (sram_input_a(7) xnor sram_addr_delay_4(7)) and
                     (sram_input_a(8) xnor sram_addr_delay_4(8)) and
                     (sram_input_a(9) xnor sram_addr_delay_4(9));
        test_b123 <= (sram_input_b(1) xnor sram_addr_delay_4(11)) and
                     (sram_input_b(2) xnor sram_addr_delay_4(12)) and
                     (sram_input_b(3) xnor sram_addr_delay_4(13));
        test_b456 <= (sram_input_b(4) xnor sram_addr_delay_4(14)) and
                     (sram_input_b(5) xnor sram_addr_delay_4(15)) and
                     (sram_input_b(6) xnor sram_addr_delay_4(16));
        test_b789 <= (sram_input_b(7) xnor sram_addr_delay_4(17)) and
                     (sram_input_b(8) xnor sram_addr_delay_4(18)) and
                     (sram_input_b(9) xnor sram_addr_delay_4(19));
      end if;
      if( sram_state_4 = sram_state_read ) then
        test_a <= test_a123 and test_a456 and test_a789;
        test_b <= test_b123 and test_b456 and test_b789;
      end if;
      if( sram_state_5 = sram_state_read ) then
        bnc <= test_a and test_b;
      end if;
    end if;
  end process;

  fx2_out:process (sramclk180)
  begin
    if(sramclk180'event and sramclk180='1') then
      fx2fd( 8 downto 0) <= sram_input_a;
      fx2fd(15 downto 9) <= sram_addr_delay_4(20 downto 14);
    end if;
  end process;

  tpa_sync:process (sramclk180)
  begin
    if(sramclk180'event and sramclk180='1') then
      tpadown_1 <= tpadown;
      tpaup_1   <= tpaup;
    end if;
  end process;

end v6behave;
Article: 74359
Robert Sefton wrote:
> "Brannon King" <bking@starbridgesystems.com> wrote in message
> news:ci7ot5$e7q@dispatch.concentric.net...
>> Once upon a time I saw a number that told me what percentage of my project
>> was unconstrained. I can't seem to get TRCE (Xilinx ISE) to show me that
>> number again. Can someone point me in the right direction?
>
> trce used to report % of paths covered at the bottom of the .twr report, but
> that seems to have disappeared between trce v5.x and v6.x. It was a
> misleading number anyway because it included paths that could not be
> constrained, so 100% was not possible. Maybe that's why they quit reporting
> it. But, as Philip said, the trce -u option will report your unconstrained
> paths.

Your guess is correct. We stopped reporting the percentage of paths covered because it was inaccurate, and making the number 100% accurate turns out to be a very run-time-inefficient operation. We're still looking for an acceptable run-time solution...

Russ Panneton
Article: 74360
Where is the demo software for this wonderful Spartan 3 kit? I was most surprised when I plugged in a VGA monitor and saw all the switches, buttons, and jumpers displayed in multicolors. Does the keyboard input do something as well?
Article: 74361
Markus Fuchs wrote:
> Dear experts,
>
> We have a PIC MCU, an Altera Flex10k10 (EPF10K10ATC100-3) and a serial
> EEPROM connected to a I2C bus.
> The PIC and EEPROM are 5V devices, the FPGA is a 3,3V device with MultiVolt
> IOs. There is a I2C slave controller implemented in the FPGA that is
> definitely working correct.
> Our problem is that we are experiencing difficulties when writing from the
> PIC to the FPGA over I2C.
> The FPGA doesn't read the correct bytes (it reads 0xFFs). It didn't help to
> change the pull-ups from 400R to 1K, 4K7, 10K!
> Now, we realized that communication works if we connect 5,6V to the pull-ups
> instead 4,7V as we used before.
> Can somebody please explain why this solved our problem?
> Currently it is more a workaround than a solution since we don't know what
> exactly our problem is.

I believe when you say "4,7V" you mean what we call 4.7 volts in the US, no? If so, changing it to 5.6 volts should not make a difference if your power supplies are connected correctly. The FPGA should be using a threshold around 1.4 volts, give or take 0.6 volts. So even 3.3 volts should work for all devices in the chain.

Have you taken any voltage measurements on the I2C signals? What values do you get with the two different pull-up voltages?

--
Rick "rickman" Collins
rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design
URL http://www.arius.com
4 King Ave                    301-682-7772 Voice
Frederick, MD 21701-3110      301-682-7666 FAX
Article: 74362
"Brad Smallridge" <bradsmallridge@dslextreme.com> wrote in message news:10mdn6aemb3n325@corp.supernews.com... > > The timing still ends up needing to be relative to something. The SRAM > > clock comes from a synthesized higher frequency. Do you have a common > > reference between the SRAM clock synthesizer and the DCM? > > I'm not sure what you are saying. I am driving the SRAM clock from the DCM > through an output pad. First, I'd see if I could drive the SRAM clock from a *register* driven by the DCM allowing matched delays for the clock and data. The phasing might make that unreasonable but it's a suggestion. The clock that serves as the reference for the DCM has a clk-to-out for the SRAM clock relative to the reference clock. The clock that drives the data has a clock-to-out for that data relative to the reference clock. These reference clocks to output are where the constraints would be. As for your VHDL, my apologies: I'm a Verilog guy and parsing through VHDL would give me a headache. I'm hoping someone else can help you find how to get your registers into the IOBs.Article: 74363
Hmm. I didn't think of tying the timing nets together with the enable. That makes a lot of sense. Thanks.Article: 74364
Markus Fuchs wrote:
> Dear experts,
>
> We have a PIC MCU, an Altera Flex10k10 (EPF10K10ATC100-3) and a serial
> EEPROM connected to a I2C bus.
> The PIC and EEPROM are 5V devices, the FPGA is a 3,3V device with MultiVolt
> IOs. There is a I2C slave controller implemented in the FPGA that is
> definitely working correct.
> Our problem is that we are experiencing difficulties when writing from the
> PIC to the FPGA over I2C.
> The FPGA doesn't read the correct bytes (it reads 0xFFs). It didn't help to
> change the pull-ups from 400R to 1K, 4K7, 10K!
> Now, we realized that communication works if we connect 5,6V to the pull-ups
> instead 4,7V as we used before.
> Can somebody please explain why this solved our problem?
> Currently it is more a workaround than a solution since we don't know what
> exactly our problem is.

If the PIC is always the master, you could try making the SCL line always CMOS drive: in a single-master system, only SDA needs to be open drain. That will ensure faster clock edges. What you describe is a little counter-intuitive, especially as you say the resistor change had no effect, but it sounds closest to edge effects; FPGAs tend to be intolerant of slow edges.

-jg
Article: 74365
Well, thanks for all your help. I am new to VHDL and totally unequipped in Verilog. I just found the problem with the IOBs. It seems that both the Synthesize tool and the Map tool have a pack-registers-into-IOBs option; turning both of them on seems to have packed all the IOBs.

"John_H" <johnhandwork@mail.com> wrote in message news:zvB9d.31$Be6.2853@news-west.eli.net...
> "Brad Smallridge" <bradsmallridge@dslextreme.com> wrote in message
> news:10mdn6aemb3n325@corp.supernews.com...
>>> The timing still ends up needing to be relative to something. The SRAM
>>> clock comes from a synthesized higher frequency. Do you have a common
>>> reference between the SRAM clock synthesizer and the DCM?
>>
>> I'm not sure what you are saying. I am driving the SRAM clock from the DCM
>> through an output pad.
>
> First, I'd see if I could drive the SRAM clock from a *register* driven by
> the DCM, allowing matched delays for the clock and data. The phasing might
> make that unreasonable, but it's a suggestion.
>
> The clock that serves as the reference for the DCM has a clock-to-out for the
> SRAM clock relative to the reference clock. The clock that drives the data
> has a clock-to-out for that data relative to the reference clock. These
> reference-clock-to-output paths are where the constraints would be.
>
> As for your VHDL, my apologies: I'm a Verilog guy and parsing through VHDL
> would give me a headache. I'm hoping someone else can help you find how to
> get your registers into the IOBs.
>
Article: 74366
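For anyone finding this thread later: besides the global pack-registers switches Brad mentions above, XST also accepts a per-signal IOB attribute. A minimal sketch (my illustration; the entity and signal names are assumptions):

library ieee;
use ieee.std_logic_1164.all;

entity addr_out_reg is
  port (
    clk : in  std_logic;
    d   : in  std_logic_vector(20 downto 0);
    q   : out std_logic_vector(20 downto 0)
  );
end addr_out_reg;

architecture rtl of addr_out_reg is
  signal q_reg : std_logic_vector(20 downto 0);

  -- ask XST/MAP to pack this register into the output IOBs
  attribute IOB : string;
  attribute IOB of q_reg : signal is "TRUE";
begin
  process(clk)
  begin
    if rising_edge(clk) then
      q_reg <= d;      -- feeds only the pad, so IOB packing is possible
    end if;
  end process;

  q <= q_reg;
end rtl;

The equivalent UCF form would be something like INST "q_reg*" IOB = TRUE; (the instance name here is an assumption). Either way, the register can only be packed into the IOB if nothing except the pad reads it.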
"Daniel" <df@yahoo.com> wrote in message news:<cjeo3u$28o9$1@biggoron.nerim.net>... > Hi, > > I'm using OCR processing (optical character recognition) on a 3.2 Ghz PCU. > It is not fast enough (10 minutes for each single newspaper page). Do you > think that a FPGA solution could increase the speed of the processing? Must > the software be programmed in order to be used with FPGA's or do solutions > exist regardless of the soft? > > Thanks for your information. > > Daniel > Paris I believe this could be sped up by the use of an FPGA coprocessor. Your OCR s/w would definitely have to be recoded to run on the FPGA, but would provide an excellent starting point. In order to achieve a significant increase (order of magnitude, or better) you have to do one of two things (or both): 1. Parallelize the existing code. 2. Pipeline the existing code. I would suggest profiling your code to find out which part(s) are taking 90% of the execution time. Its probably a small amount. This small part of the code can be converted to FPGA for acceleration. By way of example, I am examining some atmospheric modelling code (written in Fortran). This code has tens of thousands of lines of code, but the part that is taking the bulk of the time is less than 100 lines. Accelerating this part could realize a big performance boost. Besides VHDL and Verilog, you have the option of writing your FPGA code in C or Java. Feel free to contact me at work for more info (t h o m a s DOT s e i m AT p n l DOT gov). TomArticle: 74367
Marc,

Care to relate your experiences of the internal termination? I've had no trouble at 600Mbit/s data and a 600MHz clock. Some of my colleagues have mentioned that they've run into problems, and they've reverted to fitting external 100R resistors.

Thanks, Syms.

"Marc Randolph" <mrand@my-deja.com> wrote in message news:15881dde.0410080333.34287cb4@posting.google.com...
> Just be aware that the internal termination can be
> less than ideal.
>
> Have fun,
Article: 74368
Try this when you compile:

vlog +incdir+<directory> file.v

Pete
Article: 74369
Martin Schoeberl wrote:
>> The PLL will lock AFTER configuration, so after configuration (or a circuit
>> break or so), lock will be deasserted. Also note that the lock signal is
>> directly derived from the phase comparator, so while the PLL is locking,
>> you'll see the lock pin be asserted and deasserted a few times before it
>> becomes a stable '1'.
>>
>> The best solution is to have a counter that counts up to some value as long
>> as lock is '1', and gets reset when it becomes '0'. Once it reaches the
>> count value (say 31) it should stop counting and tell the logic that the
>> clock is stable, which I think is easiest to achieve by ANDing this signal
>> with the (synchronous) reset signal for the circuit.
>
> Does that mean we have to do this in every design that uses a PLL? Can
> the FPGA be driven (and therefore violating setup times) with a too fast
> clock during PLL startup?
> I didn't care about the PLL lock till now, just inserted it in the clock
> path.

It's something I recommend for stability; it's not a must as long as your peripherals are not terribly timing-sensitive.

After configuration, the PLL already has a few hundred config clock cycles to stabilize and will be in fairly close range of the selected target frequency (I don't have any scientifically significant data on this). That's why checking for a stable lock signal only needs checking for 31 clock cycles (and that's probably pessimistic too - but then again, you can never be too pessimistic).

However, if your design contains some well-known softcore starting with J or N that needs to access (DDR-)SDRAM or some other timing-sensitive piece of memory in the first few cycles after reset, I would definitely recommend waiting for the lock pin to stabilize for a few clock cycles before taking the core out of reset.

Then again, I still write my designs using the stable VHDL subset available in ViewSynthesis released 1992, so I may just as well be an overly-defensive designer ;-)

Best regards,
Ben
Article: 74370
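The counter Ben describes above is only a few lines of VHDL; a minimal sketch (the entity name, signal names and 5-bit counter width are my assumptions):

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity pll_lock_filter is
  port (
    clk        : in  std_logic;   -- PLL output clock
    pll_locked : in  std_logic;   -- raw lock pin from the PLL
    clk_stable : out std_logic    -- '1' once lock has been high for 31 cycles
  );
end pll_lock_filter;

architecture rtl of pll_lock_filter is
  signal cnt : unsigned(4 downto 0) := (others => '0');
begin
  process(clk)
  begin
    if rising_edge(clk) then
      if pll_locked = '0' then
        cnt <= (others => '0');   -- any glitch on lock restarts the count
      elsif cnt /= 31 then
        cnt <= cnt + 1;
      end if;
    end if;
  end process;

  clk_stable <= '1' when cnt = 31 else '0';
end rtl;

clk_stable can then be ANDed into the synchronous reset of the rest of the design, so the logic only leaves reset once lock has been continuously high for 31 cycles.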
Ben,

>> Does that mean we have to do this in every design that uses a PLL? Can
>> the FPGA be driven (and therefore violating setup times) with a too fast
>> clock during PLL startup?
>> I didn't care about the PLL lock till now, just inserted it in the clock
>> path.
>
> It's something I recommend for stability; it's not a must as long as your
> peripherals are not terribly timing-sensitive.

I'm not afraid of peripheral timing. I was thinking about unreliable (unexpected) behavior inside the FPGA.

> After configuration, the PLL already has a few hundred config clock cycles
> to stabilize and will be in fairly close range of the selected target
> frequency (I don't have any scientifically significant data on this).

How can this happen, as the configuration of the PLL is part of the configuration process? Is the configuration data of the PLL in the first bits?

> That's why checking for a stable lock signal only needs checking for 31
> clock cycles (and that's probably pessimistic too - but then again, you
> can never be too pessimistic).

How do you get the 31 clock cycles? Is this specified?

> However, if your design contains some well-known softcore starting with J or
> N that needs to access (DDR-)SDRAM or some other timing-sensitive piece of

What is the well-known softcore with J? I only know M, N...

> memory in the first few cycles after reset I would definitely recommend
> waiting for the lock pin to stabilize for a few clock cycles before taking
> the core out of reset.

That should be easy, since booting of the softcore takes some time.

Regards,
Martin
Article: 74371
Martin Schoeberl wrote:
<snip>
>> However, if your design contains some well-known softcore starting with
>> J or N that needs to access (DDR-)SDRAM or some other timing-sensitive
>> piece of
>
> What is the well-known softcore with J? I only know M, N...

He might mean Jop ? :)

-jg
Article: 74372
Hi Martin,

>>> Does that mean we have to do this in every design that uses a PLL? Can
>>> the FPGA be driven (and therefore violating setup times) with a too fast
>>> clock during PLL startup?
>>> I didn't care about the PLL lock till now, just inserted it in the clock
>>> path.
>>
>> It's something I recommend for stability; it's not a must as long as your
>> peripherals are not terribly timing-sensitive.
>
> I'm not afraid of peripheral timing. I was thinking about unreliable
> (unexpected) behavior inside the FPGA.

As I stated in the line below:

>> After configuration, the PLL already has a few hundred config clock cycles
>> to stabilize and will be in fairly close range of the selected target
>> frequency (I don't have any scientifically significant data on this).

I haven't seen any problems on Cyclone/Stratix devices so far, except for those booting up a NIOS that had their boot code copied into S(D)RAM and using a PLL. This is probably due to Altera's pessimistic timing models. That's as much as I can say...

> How can this happen, as the configuration of the PLL is part of the
> configuration process? Is the configuration data of the PLL in the first
> bits?

No, it's interspersed within the config data, especially with the larger devices. However, after clocking in the full bitstream, the device uses something like 140 config clock cycles before becoming operational (details depend on the device - see the appropriate handbook), and at some (fairly early) time during this interval the PLLs start locking. No details known. I could maybe get at the details, but it would severely deplete my stock of Brownie Points because of bugging the Silicon Guys for no critical reason.

>> That's why checking for a stable lock signal only needs checking for 31
>> clock cycles (and that's probably pessimistic too - but then again, you
>> can never be too pessimistic).
>
> How do you get the 31 clock cycles? Is this specified?

Nope. Just something that seems to work well at 64MHz. 15 seems to work pretty well too, I've heard, but as I said, you can never be too pessimistic, and it saves only half a microsecond after FPGA bootup. A Pentium takes about a millisecond to initialize, so I wouldn't worry too much ;-)

>> However, if your design contains some well-known softcore starting with
>> J or N that needs to access (DDR-)SDRAM or some other timing-sensitive
>> piece of
>
> What is the well-known softcore with J? I only know M, N...

Does JOP ring a bell? ;-)

>> memory in the first few cycles after reset I would definitely recommend
>> waiting for the lock pin to stabilize for a few clock cycles before
>> taking the core out of reset.
>
> That should be easy, since booting of the softcore takes some time.

Yep. I also continually advertise not using a fully asynchronous reset, but one that gets released at some clock edge (i.e. one or two FFs between the reset signal and the actual logic), so that (1) it gets a constrainable Tsu value and (2) gets shielded from metastability.

Best regards,
Ben
Article: 74373
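A sketch of the synchronized reset release Ben recommends above (my illustration; the entity and signal names are assumptions): the reset asserts asynchronously but is released only on a clock edge, through two flip-flops.

library ieee;
use ieee.std_logic_1164.all;

entity reset_sync is
  port (
    clk       : in  std_logic;
    async_rst : in  std_logic;   -- e.g. an external reset, or "not clk_stable"
    sync_rst  : out std_logic    -- asserted asynchronously, released on a clock edge
  );
end reset_sync;

architecture rtl of reset_sync is
  signal sr : std_logic_vector(1 downto 0) := (others => '1');
begin
  process(clk, async_rst)
  begin
    if async_rst = '1' then
      sr <= (others => '1');
    elsif rising_edge(clk) then
      sr <= sr(0) & '0';         -- shift in zeros: release takes two clock edges
    end if;
  end process;

  sync_rst <= sr(1);
end rtl;

Driving async_rst from "not clk_stable" of a lock filter like the one earlier in the thread combines both ideas: the design stays in reset until the PLL has been locked for a while, and the release itself is clean, metastability-hardened and constrainable.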
>> What is the well-known softcore with J? I only know M, N...
>
> He might mean Jop ? :)

:-))))))))

Martin
Article: 74374
> Yep. I also continually advertise not using a fully asynchronous reset, but
> one that gets released at some clock edge (i.e. one or two FFs between the
> reset signal and the actual logic), so that (1) it gets a constrainable Tsu
> value and (2) gets shielded from metastability.

I thought this was already common sense. And with an FPGA you need no external reset at all. To be on the safe side (as you suggested), take a small counter till you release the reset signal.

Martin