Messages from 56125

Article: 56125
Subject: Re: JTAG madness
From: Fred Viles <fv+abuse@nospam.epitools.com>
Date: Thu, 29 May 2003 04:24:31 GMT
rickman <spamgoeshere4@yahoo.com> wrote in
news:3ED3A6C7.229A5223@yahoo.com: 

>...
> The second will have an ARM MCU and a large CPLD.
>...

There are JTAG probes available for ARM, at least, that have no
problem with there being multiple TAPs on the JTAG chain. 
(MultiICE from ARM and MAJIC from EPI, just to name two).  Unless
your CPLD programmer can't handle it, you shouldn't need to
provide for isolating the TAPs in this section. 

> I figure that a 1K resistor driving a 10 pF load will create a
> 10 nS (roughly) rise time.  That should not create a problem
> with the 10 MHz clocks typically found on JTAG.

FWIW, MAJIC runs up to 40 MHz TCLK.  But it's configurable.
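
As a rough check of the quoted estimate, here is the arithmetic for a single-pole RC edge. It assumes an ideal 1K source resistance, a lumped 10 pF load and the usual 10%-90% definition of rise time; none of these are stated in the original post.

```latex
\begin{align*}
\tau &= RC = 1\,\mathrm{k\Omega} \times 10\,\mathrm{pF} = 10\,\mathrm{ns} \\
t_r^{(10\%\text{--}90\%)} &= \tau \ln 9 \approx 2.2\,\tau = 22\,\mathrm{ns} \\
T_{10\,\mathrm{MHz}} &= 100\,\mathrm{ns}, \qquad T_{40\,\mathrm{MHz}} = 25\,\mathrm{ns}
\end{align*}
```

Under these assumptions the edge takes about a fifth of a 10 MHz period, but nearly a whole period at 40 MHz, so the series resistor would probably need to be smaller if the probe is run at its top speed.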

- Fred (works for EPI)

Article: 56126
Subject: Re: 20 to 5 encoder optimization?
From: Gil Herbeck <gil@radix20.com>
Date: Thu, 29 May 2003 05:06:02 GMT
Actually, the number of possible encodings is much greater than 2^32.
It is ...
     32! / 11!

Binary encode, leading zero count, trailing zero count, they are all
roughly the same circuit really - a log2 structure.  You'll find them
built into common synthesizers, but I think they are optimized for
performance.  You can probably figure out a ripple structure that
would save some area.

Cheers,
Gil


Muzaffer Kal wrote:

> Hi,
> I need to implement 20 (one hot) to 5 encoder. The result would cover
> 21 codes of the 32 available (one for no input active case). I am
> hoping that I can do better than binary encoding. The question is how
> do I find an optimal encoding where any unique assignment of the 21
> codes with the remaining codes assigned in a don't care fashion would
> be acceptable. The optimality criteria can be minimum area (say
> minimum number of 2 input gates). One way would be to write a script
> which generates Verilog rtl for all 2^32 possible assignments and run
> the synthesizer for a long time trying all of them. Any other good
> ways ?
> 
> Muzaffer Kal
> 
> http://www.dspia.com
> ASIC/FPGA design/verification consulting specializing in DSP algorithm implementations
> 


Article: 56127
Subject: Re: need help on sending 500Mbit/s data through 100 feet of cable, Giga-Ethernet?
From: hmurray@suespammers.org (Hal Murray)
Date: Thu, 29 May 2003 06:35:24 -0000
>1) how difficult to create a custom-built interface in Xilinx Virtex
>II to GMII or TBI, just to transfer data? any pointer, link , that I
>could read about the GMII? I try google, but most of them is talking
>about interfacing with the MAC. not much on the protocol of GMII.

GMII is pretty simple.  Can you get the PHY chips?

The transmit side is clock, 8 bits of data, and a data-valid bit.
You can think of data-valid as carrier.  The clock goes in the same
direction as the data.

The receive side is the same thing in the other direction - the
PHY chip provides the clock.

You can take two gizmos that talk GMII and wire them directly
to each other.  Or a full-duplex gizmo can talk to itself
through a loop-back connector.

That's for gigabit.  It gets more complicated if you want 10/100
too, but you can ignore that if you are willing to run at only 1000.
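
The transmit side described above can be sketched in a few lines of VHDL. This is only an illustration under assumptions: the pin names GTX_CLK, TXD, TX_EN and TX_ER follow the usual GMII naming (TX_ER is not mentioned in the post), while the entity name, the user-side ports and the 125 MHz clock input are hypothetical. A real PHY may also expect preamble bytes at the start of each burst, so check its data sheet.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical gigabit-only GMII transmit sketch: one byte per clock.
entity gmii_tx_sketch is
  port (
    clk125  : in  std_logic;                     -- assumed 125 MHz clock (8 bits x 125 MHz = 1 Gbit/s)
    din     : in  std_logic_vector(7 downto 0);  -- byte from user logic (hypothetical port)
    din_vld : in  std_logic;                     -- '1' while din carries data (hypothetical port)
    gtx_clk : out std_logic;                     -- clock travels with the data
    txd     : out std_logic_vector(7 downto 0);
    tx_en   : out std_logic;                     -- the "data-valid" bit
    tx_er   : out std_logic);
end entity gmii_tx_sketch;

architecture rtl of gmii_tx_sketch is
begin
  -- Simplest possible clock forwarding; a real design would more likely
  -- use an output DDR register or similar to control clock-to-data skew.
  gtx_clk <= clk125;

  -- Register data and valid together so they leave on the same edge.
  process (clk125)
  begin
    if rising_edge(clk125) then
      txd   <= din;
      tx_en <= din_vld;
    end if;
  end process;

  tx_er <= '0';  -- never signal a transmit error in this sketch
end architecture rtl;
```

The receive side would mirror this, with the clock, data and valid signals all coming from the PHY.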

-- 
The suespammers.org mail server is located in California.  So are all my
other mailboxes.  Please do not send unsolicited bulk e-mail or unsolicited
commercial e-mail to my suespammers.org address or any of my other addresses.
These are my opinions, not necessarily my employer's.  I hate spam.


Article: 56128
Subject: Re: JTAG madness
From: Magnus Homann <d0asta@mis.dtek.chalmers.se>
Date: 29 May 2003 08:39:27 +0200
rickman <spamgoeshere4@yahoo.com> writes:

> I like most of what you said, but I don't think you can *debug* the
> board with all the TMS and TCK signals in parallel.  This would send the
> same instructions to all devices and put them in a possibly poor state
> for normal operation when you only intended to be controlling part P.  I
> expect to have to use jumpers to disable the TRST, TMS or TCK inputs on
> all parts other than P.  

Blame someone else! :-) Seriously, I think now that we made a similar
arrangement on TMS, so that we had two separate TMS signals. If TMS =
'1', the TAP should be in reset, regardless of TDI and TCK clocking.

TRST? That's optional, and I wouldn't recommend connecting all parts
in parallel anyway for that signal.

Homann
-- 
Magnus Homann, M.Sc. CS & E
d0asta@dtek.chalmers.se

Article: 56129
Subject: Re: 20 to 5 encoder optimization?
From: hmurray@suespammers.org (Hal Murray)
Date: Thu, 29 May 2003 07:05:23 -0000
>I need to implement 20 (one hot) to 5 encoder. The result would cover
>21 codes of the 32 available (one for no input active case). I am
>hoping that I can do better than binary encoding.  ...

I'm not a wizard at this stuff...

The reason that people use one-hot encodings is that it is
simple to do things like this.  Write down a table that maps
each one-hot state into a 5 bit number.  You need a giant
OR gate for each output bit - a column of that table.
There will be one input term to an OR gate for each 1 bit
in the table.

You can save logic by picking the codes to use in the
table that have fewer bits in them.  For example, if
you start by using codes 1 through 20, 00001 binary
through 10100, you can trade state 15 (01111) for
21 (10101) and save 1 gate term.  You can save one
more by using state 24.  I think that's it, but I'm
not sure.  (That's assuming your application will let
you pick any encoding you want rather than requiring
them to be compact.)

You have to decide what to do about illegal states - the
ones that have 2 or more bits on.  The gates above only
work correctly if the input is a legal one-hot state.

You can catch the particular state of all zeros easily
if you leave the 0 code for that case.  If you don't need
to catch that case, you can save 3 more OR terms by using
the 0 code to represent one of the valid states.
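
The table-of-OR-gates idea above can be sketched in VHDL as follows. The entity and constant names are made up for illustration, and the table is just one possible assignment: codes 1 through 20 with code 15 (01111) swapped for 24 (11000), and code 0 kept for the no-input case. By my count that table has 40 one bits, against 42 for plain codes 1 through 20.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 20-to-5 one-hot encoder built from a code table.
entity onehot20_enc is
  port (
    hot  : in  std_logic_vector(19 downto 0);  -- one-hot input (at most one bit set)
    code : out std_logic_vector(4 downto 0));  -- "00000" when no input is active
end entity onehot20_enc;

architecture rtl of onehot20_enc is
  type code_table_t is array (0 to 19) of std_logic_vector(4 downto 0);
  -- One entry per input. Each '1' in a column is one input term of that
  -- column's OR gate, so fewer ones means fewer OR terms.
  constant CODES : code_table_t := (
    "00001", "00010", "00011", "00100", "00101",
    "00110", "00111", "01000", "01001", "01010",
    "01011", "01100", "01101", "01110", "11000",  -- 24 used instead of 15
    "10000", "10001", "10010", "10011", "10100");
begin
  process (hot)
    variable acc : std_logic_vector(4 downto 0);
  begin
    acc := (others => '0');
    -- OR together the codes of all active inputs. The result is only
    -- meaningful when the input is a legal one-hot (or all-zero) value.
    for i in hot'range loop
      if hot(i) = '1' then
        acc := acc or CODES(i);
      end if;
    end loop;
    code <= acc;
  end process;
end architecture rtl;
```

Because the table is a constant, a synthesizer should reduce the loop to one OR gate per output bit. Changing the encoding then only means editing the table.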

-- 
The suespammers.org mail server is located in California.  So are all my
other mailboxes.  Please do not send unsolicited bulk e-mail or unsolicited
commercial e-mail to my suespammers.org address or any of my other addresses.
These are my opinions, not necessarily my employer's.  I hate spam.


Article: 56130
Subject: Re: need help on sending 500Mbit/s data through 100 feet of cable, Giga-Ethernet?
From: Falser Klaus <kfalser@IHATESPAMdurst.it>
Date: Thu, 29 May 2003 10:23:52 +0200
In article <b34a8c79.0305281549.5f3a9ff8@posting.google.com>, ospyng@yahoo.com says...
> hi all,
> need some help on this,
> 
> need to send about 500Mbit/s of data(images) between our equipment 100
> ft (30 meter ) apart. 

<snip>

A very simple solution is to use an 8B/10B serializer (inside the FPGA or external) 
and a transceiver for optical fibre. 
Look at the Agilent HFBR transceiver modules.

You will get a channel where bytes go in on one side and drop out on the other
side. 
You don't even need a protocol. 8B/10B conversion gives you the possibility to send about 10 
different "command bytes", so you can simply signal the begin or end of a data packet,
the request of another packet, etc.

Hope this helps     
 
Klaus Falser
Durst Phototechnik AG
kfalser@IHATESPAMdurst.it

Article: 56131
Subject: Re: JTAG madness
From: "Arie de Muynck" <Sorry_I_hate_spam@nomail.com>
Date: Thu, 29 May 2003 10:30:10 +0200

"rickman" <spamgoeshere4@yahoo.com> wrote in message
news:3ED539F9.37F04F9A@yahoo.com...
> Magnus Homann wrote:
> >
> > rickman <spamgoeshere4@yahoo.com> writes:
> >
> > > I am finding JTAG to be a major hassle to try to use for both debug
> > > and production boundary scan.  Seems there are conflicting requirements
> > > which the two camps are not generally interested in dealing with.
> >
> > RISCTrace/Watch from IBM used to have the same issues. We had the
> > PowerPC in the middle of the chain.
> >
> > If I remember correctly we (i.e. my colleague) solved this by:
> >
> > 1) connecting all TMS and TCK in parallel.
> >
> > 2) routing TDI/O from part A to connector X to part P and split to
> >    connector X and part B (see fig).
> >
> > 3) When debugging, connector was used so that only part P was in the
> >    chain. Pullups on TDI for other parts kept them out of trouble.
> >
> > 4) When testing for production, connector was used so that the entire
> >    chain was used (strap in X to connect TDO of A with TDI on P).
> >
> > 5) The last part could also be achieved with resistor.
> >
> > The issue with long stubs on TDI/O we didn't think mattered,
> > considering it's a synchronous system with a fairly low system clock.
TCK on the other hand was routed as a long clock winding to all parts and
ending in a Thevenin termination.
> >
> > All this from memory, if it doesn't work, blame someone else.
> >
> > (fig 1)
> >
> >   |--------|         |--------|     |--------|
> > --| part A |--|   |--| part P |-----| part B |---
> >   |--------|  |   |  |--------|  |  |--------|
> >               |   |              |
> >               |   |              |
> >             |----------------------|
> >             | connector X          |
> >             |----------------------|
>
> I like most of what you said, but I don't think you can *debug* the
> board with all the TMS and TCK signals in parallel.  This would send the
> same instructions to all devices and put them in a possibly poor state
> for normal operation when you only intended to be controlling part P.  I
> expect to have to use jumpers to disable the TRST, TMS or TCK inputs on
> all parts other than P.


It is only necessary to disconnect the TMS input and pull it up: the part then
clocks in '1' bits on TMS, which brings its TAP into JTAG reset (Test-Logic-Reset,
reached after at most five TCK cycles) and leaves the part in normal operation.
It is a good habit to have at least a pullup on TMS: if it is not driven by JTAG
and spikes occur on TCK, the parts at least stay in normal operation.
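
Why a pulled-up TMS is enough can be seen from the standard 16-state TAP controller. The sketch below models only the state transitions as I understand them from IEEE 1149.1. The entity name and the in_reset output are made up for illustration, and the instruction and data registers are left out.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical model of the TAP state machine, transitions only.
entity tap_fsm_sketch is
  port (
    tck      : in  std_logic;
    tms      : in  std_logic;
    in_reset : out std_logic);  -- '1' while the TAP is in Test-Logic-Reset
end entity tap_fsm_sketch;

architecture rtl of tap_fsm_sketch is
  type tap_state_t is (
    TEST_LOGIC_RESET, RUN_TEST_IDLE,
    SELECT_DR, CAPTURE_DR, SHIFT_DR, EXIT1_DR, PAUSE_DR, EXIT2_DR, UPDATE_DR,
    SELECT_IR, CAPTURE_IR, SHIFT_IR, EXIT1_IR, PAUSE_IR, EXIT2_IR, UPDATE_IR);
  signal state : tap_state_t := TEST_LOGIC_RESET;
begin
  in_reset <= '1' when state = TEST_LOGIC_RESET else '0';

  process (tck)
  begin
    if rising_edge(tck) then
      if tms = '1' then
        -- With TMS held high, every state is at most five clocks away
        -- from Test-Logic-Reset (e.g. Shift-DR, Exit1-DR, Update-DR,
        -- Select-DR, Select-IR, reset).
        case state is
          when TEST_LOGIC_RESET      => state <= TEST_LOGIC_RESET;
          when RUN_TEST_IDLE         => state <= SELECT_DR;
          when SELECT_DR             => state <= SELECT_IR;
          when SELECT_IR             => state <= TEST_LOGIC_RESET;
          when CAPTURE_DR | SHIFT_DR => state <= EXIT1_DR;
          when EXIT1_DR | EXIT2_DR   => state <= UPDATE_DR;
          when PAUSE_DR              => state <= EXIT2_DR;
          when CAPTURE_IR | SHIFT_IR => state <= EXIT1_IR;
          when EXIT1_IR | EXIT2_IR   => state <= UPDATE_IR;
          when PAUSE_IR              => state <= EXIT2_IR;
          when UPDATE_DR | UPDATE_IR => state <= SELECT_DR;
        end case;
      else
        case state is
          when TEST_LOGIC_RESET | RUN_TEST_IDLE => state <= RUN_TEST_IDLE;
          when SELECT_DR             => state <= CAPTURE_DR;
          when CAPTURE_DR | SHIFT_DR => state <= SHIFT_DR;
          when EXIT1_DR | PAUSE_DR   => state <= PAUSE_DR;
          when EXIT2_DR              => state <= SHIFT_DR;
          when SELECT_IR             => state <= CAPTURE_IR;
          when CAPTURE_IR | SHIFT_IR => state <= SHIFT_IR;
          when EXIT1_IR | PAUSE_IR   => state <= PAUSE_IR;
          when EXIT2_IR              => state <= SHIFT_IR;
          when UPDATE_DR | UPDATE_IR => state <= RUN_TEST_IDLE;
        end case;
      end if;
    end if;
  end process;
end architecture rtl;
```

Note that in simulation a pulled-up net reads as 'H', not '1', so a testbench for this sketch would have to drive tms with a strong '1'.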

Example: ARM core first in chain, the rest behind it, TDO from ARM routed to
a select jumper (full chain or ARM only TDO to JTAG) and a jumper that
disconnects to isolate the rest of the chain from TMS

                +--------------------o A
                |                    o----------- TDO
TDI ---  ARM  --+--  ...REST...  ----o B
          |              |
TMS ------+---o o--------+-- pullup
               B

Close A for JTAG debugging of the ARM core, B for full chain JTAG boundary
scan or CPLD programming etc.

The use of a separate TRST with an ARM core is mainly to allow RESET to keep
the core in reset until the breakpoint registers have been set to break the
core after RESET is released.
Some ARM cores don't handle an instruction "0xF......." very well, they
crash beyond JTAG reset. And unfortunately this is the content of an
unprogrammed Flash...

Regards,
Arie de Muynck






Article: 56132
Subject: Re: JTAG madness
From: Martin Thompson <martin.j.thompson@trw.com>
Date: 29 May 2003 10:09:30 +0100
"Brett Foster" <custserv@forums.ws> writes:

> "Mike Rosing" <rosing@neurophys.wisc.edu> wrote in message
> news:3ED2E125.8010402@neurophys.wisc.edu...
> > rickman wrote:
> >
> > The short stubs are important, and the traces shouldn't have any sharp
> > angles - you want a really clean signal all around.  I assume you've got
> > ground planes and not a 2 layer board?  That'll help a lot too.
> 
> Why no sharp angles?
> 

There's a rebuttal of this concept here:

90 Degree Corners: The Final Turn. 
http://www.ultracad.com/90deg.pdf

along with a comical 'description' of what is 'thought' to be
happening:
Flying Electrons! 
http://www.ultracad.com/flying.htm

Enjoy!

Martin

-- 
martin.j.thompson@trw.com
TRW Conekt, Solihull, UK
http://www.trw.com/conekt

Article: 56133
Subject: Re: need help on sending 500Mbit/s data through 100 feet of cable, Giga-Ethernet?
From: christopher.saunter@durham.ac.uk (Christopher Saunter)
Date: Thu, 29 May 2003 10:12:25 +0000 (UTC)
spyng (ospyng@yahoo.com) wrote:
: hi all,
: need some help on this,

: need to send about 500Mbit/s of data(images) between our equipment 100
: ft (30 meter ) apart. we have rule out USB 2.0 and Firewire, because
: of the distance. both are about 5 meter per segment (as I understand).
: so we are thinking of using Gigabit Ethernet, just the phy if
: possible. I am not too familiar with the Gigabit Ethernet, so here is
: the question

Have you considered using a fibre optic repeater for firewire?  I was 
looking into this a few months ago, and they exist, and transparently 
extend 1394/firewire to > 100m.  If I remember correctly a pair of boxes 
+ 30m fibre was around US$1000.  Can't find the link I'm afraid, although 
I think it was from an AV specialist.

---

cds

Article: 56134
Subject: Re: need help on sending 500Mbit/s data through 100 feet of cable, Giga-Ethernet?
From: Uwe Bonnes <bon@elektron.ikp.physik.tu-darmstadt.de>
Date: Thu, 29 May 2003 10:17:02 +0000 (UTC)
spyng <ospyng@yahoo.com> wrote:
: hi all,
: need some help on this,

: need to send about 500Mbit/s of data(images) between our equipment 100
: ft (30 meter ) apart. we have rule out USB 2.0 and Firewire, because
: of the distance. both are about 5 meter per segment (as I understand).
: so we are thinking of using Gigabit Ethernet, just the phy if
: possible. I am not too familiar with the Gigabit Ethernet, so here is
: the question

: 1) how difficult to create a custom-built interface in Xilinx Virtex
: II to GMII or TBI, just to transfer data? any pointer, link , that I
: could read about the GMII? I try google, but most of them is talking
: about interfacing with the MAC. not much on the protocol of GMII.

: 2) or must we use a MAC and implement TCP/IP stack? we trying to avoid
: that.

: 3) we are looking at LVDS too, National claim to be able to drive
: 300meter of cable with an Adjustable Output cable driver CLC001, any
: one have experience with it ?

: any comments,link, pointer are welcome. thanks

Have a look at http://www.inova-semiconductors.com/ for the Gigastar
Chip/Modules 

Bye
-- 
Uwe Bonnes                bon@elektron.ikp.physik.tu-darmstadt.de

Institut fuer Kernphysik  Schlossgartenstrasse 9  64289 Darmstadt
--------- Tel. 06151 162516 -------- Fax. 06151 164321 ----------

Article: 56135
Subject: Antifuse and SRAM FPGA
From: tatto0_2000@yahoo.com (Wong)
Date: 29 May 2003 03:40:20 -0700
Hi FPGA experts,
  What are the differences between antifuse FPGAs
(one-time-programmable, OTP) and SRAM-based FPGAs (I call them
many-time-programmable, MTP)?    ;)
  If one is using antifuse and would like to 'migrate' to SRAM-based,
what are the *important* considerations before the migration?
  As far as I know, one consideration is that a large share of the
sequential and combinational cells is 'sacrificed' for RAMs in
SRAM-based FPGAs (comparing both at the same number of system gates).
Besides, SRAM-based FPGAs need an external ROM to download the
bitstream at power-up. Antifuse FPGAs, by contrast, are
S&C-cell-rich compared to SRAM-based ones and need no extra device for
downloading the bitstream.
  Any other considerations? Please feel free to correct me if I am
wrong.
  Thank you guys.

Article: 56136
Subject: Simplest PCI - VHDL core (BIOS Post port80 Tester)
From: antti@case2000.com (Antti Lukats)
Date: 29 May 2003 03:44:03 -0700
Hi

I finally managed to get the simplest PCI design working in an FPGA!
Well, it is only a BIOS POST code tester with two 7-segment LED digits,
but it's still nice to see some PCI board with an FPGA doing something :)

Below is the main module. Apart from that, only the 7-segment decoder, an
upper wrapper and a constraints file are needed. I think it works OK, as I
see BIOS code "C1" with an Award BIOS when no RAM chips are inserted.

---- BIOS POST CODE TESTER (VHDL)

--
-- Poor Man PCI Interface: BIOS POST Code Tester
-- Checked with Memec Spartan IIS200 PCI Board
-- Copyright 2003 Nugis Foundation. All rights reserved.
--

library ieee;
use ieee.std_logic_1164.all;
use IEEE.std_logic_arith.all;
use IEEE.std_logic_unsigned.all;

entity PM_Pci is port(
  CLK:   in std_logic;
  FRAME: in std_logic;
  IRDY:  in std_logic;
  IDSEL: in std_logic;

  CBE0: in std_logic;
  CBE1: in std_logic;
  CBE2: in std_logic;
  CBE3: in std_logic;
  AD:   in  std_logic_vector(7 downto 0);
  -- 7 Seg display (2 digits)
  seg_1: out std_logic_vector(6 downto 0);
  seg_2: out std_logic_vector(6 downto 0));
end PM_Pci;

architecture arch of PM_Pci is

signal pci_io_write: std_logic;		
signal pci_state_idle: std_logic;
signal pci_state_1: std_logic;
signal pci_addr: std_logic_vector(7 downto 0);
signal port80_data: std_logic_vector(7 downto 0);		

COMPONENT led7seg_decoder PORT(
  data : IN std_logic_vector(3 downto 0);          
  seg : OUT std_logic_vector(6 downto 0));
END COMPONENT;

begin
	-- Segment LED decoders to port 80 Data register
	Inst_led7seg_decoder1: led7seg_decoder PORT MAP(
		data => port80_data(3 downto 0), seg => seg_1);
	Inst_led7seg_decoder2: led7seg_decoder PORT MAP(
		data => port80_data(7 downto 4), seg => seg_2);

  process (CLK)
  begin
    -- on falling edge of PCI Clock!
    if (CLK'event) and CLK = '0' then
      -- transaction done
      if (FRAME = '1') and (IRDY = '1') then
        -- idle, abort transaction !
        pci_state_idle <= '1';
        pci_state_1 <= '0';
      else
        -- transaction in progress, track ?
        if pci_state_idle = '1' then
          -- to next state
          pci_state_1 <= '1';
          -- clear idle
          pci_state_idle <= '0';
          -- first phase after FRAME went low: we are active,
          -- latch the address ...
          pci_addr(7 downto 0) <= AD(7 downto 0);
          -- ... and whether the command is an I/O write (C/BE# = "0011")
          pci_io_write <=
            not CBE3 and not CBE2 and CBE1 and CBE0;
        end if;
        if pci_state_1 = '1' then
          pci_state_1 <= '0'; -- exit state
          -- if address xx80 then latch data
          if (pci_io_write = '1') and
             (pci_addr(7 downto 0) = "10000000") then
            port80_data <= AD(7 downto 0); -- latch PORT 0080!
          end if;
        end if;
      end if;
    end if;
  end process;
end arch;
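
The led7seg_decoder component instantiated above is not part of the post. A minimal sketch that matches the component declaration could look like this. The segment order (seg(6 downto 0) = g f e d c b a) and the polarity ('1' = segment lit) are assumptions and would have to be adapted to the actual board.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical hex to 7-segment decoder for the POST code display.
entity led7seg_decoder is
  port (
    data : in  std_logic_vector(3 downto 0);
    seg  : out std_logic_vector(6 downto 0));
end led7seg_decoder;

architecture rtl of led7seg_decoder is
begin
  -- Assumed mapping: seg = "gfedcba", '1' lights the segment.
  with data select seg <=
    "0111111" when "0000",  -- 0
    "0000110" when "0001",  -- 1
    "1011011" when "0010",  -- 2
    "1001111" when "0011",  -- 3
    "1100110" when "0100",  -- 4
    "1101101" when "0101",  -- 5
    "1111101" when "0110",  -- 6
    "0000111" when "0111",  -- 7
    "1111111" when "1000",  -- 8
    "1101111" when "1001",  -- 9
    "1110111" when "1010",  -- A
    "1111100" when "1011",  -- b
    "0111001" when "1100",  -- C
    "1011110" when "1101",  -- d
    "1111001" when "1110",  -- E
    "1110001" when "1111",  -- F
    "0000000" when others;  -- blank for non-binary input values
end rtl;
```

For a display with active-low segments the output can simply be inverted in the upper wrapper.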


----------- XST map report 

Number of errors:      0
Number of warnings:    0
Logic Utilization:
  Number of Slice Flip Flops:        11 out of  4,704    1%
  Number of 4 input LUTs:            22 out of  4,704    1%
Logic Distribution:
    Number of occupied Slices:                          19 out of  2,352    1%
    Number of Slices containing only related logic:     19 out of     19  100%
    Number of Slices containing unrelated logic:         0 out of     19    0%
        *See NOTES below for an explanation of the effects of unrelated logic
Total Number of 4 input LUTs:        22 out of  4,704    1%
   Number of bonded IOBs:            28 out of    284    9%
      IOB Flip Flops:                               8
   Number of GCLKs:                   1 out of      4   25%
   Number of GCLKIOBs:                1 out of      4   25%

Total equivalent gate count for design:  284

----
PS: when I get time I will make the whole design available at my
private e-publishing website
http://www.graphord.com

antti lukats

Article: 56137
Subject: Re: JTAG madness
From: rickman <spamgoeshere4@yahoo.com>
Date: Thu, 29 May 2003 08:21:02 -0400
Fred Viles wrote:
> 
> rickman <spamgoeshere4@yahoo.com> wrote in
> news:3ED3A6C7.229A5223@yahoo.com:
> 
> >...
> > The second will have an ARM MCU and a large CPLD.
> >...
> 
> There are JTAG probes available for ARM, at least, that have no
> problem with there being multiple TAPs on the JTAG chain.
> (MultiICE from ARM and MAJIC from EPI, just to name two).  Unless
> your CPLD programmer can't handle it, you shouldn't need to
> provide for isolating the TAPs in this section.
> 
> > I figure that a 1K resistor driving a 10 pF load will create a
> > 10 nS (roughly) rise time.  That should not create a problem
> > with the 10 MHz clocks typically found on JTAG.
> 
> FWIW, MAJIC runs up to 40 MHz TCLK.  But it's configurable.

To the best of my knowledge, the number of devices in the chain has
little to do with the hardware.  I believe it is entirely up to the
software.  I am not up to speed on the many suppliers of debugging
hardware and software for the ARM yet.  I have heard a lot of
recommendations, but not many provide info on *why* one combination is
good or bad.  One would assume the products from ARM are good, but they
are very clearly expensive.  I also attended a demo on their OKI
specific tools and saw a lot of "issues" controlling that chip.  

-- 

Rick "rickman" Collins

rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design      URL http://www.arius.com
4 King Ave                               301-682-7772 Voice
Frederick, MD 21701-3110                 301-682-7666 FAX

Article: 56138
Subject: Re: Altera hold violation errors
From: vbetz@altera.com (Vaughn Betz)
Date: 29 May 2003 05:33:25 -0700
Hi Paulo,

It looks like what's happening is that you have your PLL clock going
through a register, and then clocking the RAM.  I would guess you've
done this to implement a divide by 2 clocking function in the
register.  Here's the relevant line from the timing report you posted:

>       Info: + Longest clock path from clock
> PLL1:PLL1_inst|altclklock:altclklock_component|outclock0 to
> destination memory is 7.774 ns
>         Info: 1: + IC(0.000 ns) + CELL(0.000 ns) = 0.000 ns; Loc. =
> PLL_2; CLK Node = 'PLL1:PLL1_inst|altclklock:altclklock_component|outclock0'
>         Info: 2: + IC(1.710 ns) + CELL(0.507 ns) = 2.217 ns; Loc. =
> LC9_16_P1; REG Node =
> 'hudsonbay2_core:inst_hudsonbay2_core|card_hw_if:inst_Card_HW_IF|msacom_m_txr:inst_msacom_m_txr|TXR_CTRL:txr_ctrl1|RX1_WEN'
>         Info: 3: + IC(3.608 ns) + CELL(1.949 ns) = 7.774 ns; Loc. =
> EC6_1_P3; MEM Node =
> 'hudsonbay2_core:inst_hudsonbay2_core|card_hw_if:inst_Card_HW_IF|msacom_m_txr:inst_msacom_m_txr|rx_qpram_1:p1_rx_qpram_inst|lpm_ram_dpZ1:lpm_ram_dp_component|LPM_RAM_DP:U1|altdpram:sram|q[14]~mem_cell_din'

The skew between the PLL clock and this clock (call it reg_clock) is
about 6 ns.  And you have a register transfer from a register clocked
by PLL clock, to the memory clocked by reg_clock.  Your skew is
greater than your data delay, so you're getting a hold violation. 
There are lots of potential solutions.  In the order I tend to prefer
them, they are:

1.  Don't use reg_clock.  Use the PLL to generate all the clocks you
need, including the divide by two clock that I assume you're using the
register to create.

2.  Note that unless you tell Quartus the detailed relationship
between reg_clock and PLL clock, Quartus tries to figure it out
itself.  It has figured out they are phase-related, but if you have
logic to do things like make sure reg_clock is inverted with respect
to PLL clock, Quartus will not figure that out by default.  So you
have to tell Quartus the phase relationship by making clock settings
on reg_clock indicating it is based off PLL clock with a certain time
offset.  Quartus also will not understand that reg_clock is toggling
at only half the frequency of PLL clock (I'm assuming you're doing a
divide by 2 here) unless you tell it, again using clock settings.  See
AN123 on www.altera.com for details on how to make clock settings. 
The derived clocks section talks about the issue you're seeing here.

3.  Be clever about the active edge of reg_clock compared to PLL
clock.  Having both reg_clock and PLL clock be positive edge triggered
is causing timing problems due to the skew between them.  You may have
better luck if you make reg_clock negative edge triggered.  I can't
really say without knowing your design for sure.  The idea is to
separate the active edges by more delay.  Remember to make clock
settings as listed in 2 above to make sure Quartus knows the
relationship between the clocks.

<increasingly hacky solutions below!>

4.  Reduce the clock skew between PLL clock and reg_clock.  Is
reg_clock routed on a global network?  That means it'll have
moderately high skew relative to PLL clock (but low skew within the
reg_clock domain itself) due to the reasonably significant global
network delay.  If reg_clock is very low fanout, you may be able to
cut its skew compared to PLL clock by not using a global network for
it.  Use a "global signal = off" assignment on reg_clock to make this happen.  You
will also want to make sure the register generating reg_clock is close
to the memory using it -- use a location assignment to do this.

5.  Increase the data delay between the register and the memory giving
the hold problem.  You can do this with location constraints (lock
them down reasonably far apart), or by inserting LEs on the signal
path between them to act as delay elements.  To insert LEs use the
"logic cell insertion" assignment on the register (source) node that
feeds your ram.  You can insert as many logic cells as you like.
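
For reference, the hold failure described at the top of this post can be reproduced from the numbers in the quoted timing report. The symbol names below are chosen here for illustration and do not appear in the Quartus output.

```latex
\begin{align*}
t_{\mathrm{skew}} &= t_{\mathrm{clk,dest}} - t_{\mathrm{clk,src}}
                   = 7.774 - 1.710 = 6.064\ \mathrm{ns} \\
t_{\mathrm{req}}  &= t_{\mathrm{skew}} - t_{\mathrm{co}} + t_{\mathrm{h}}
                   = 6.064 - 0.342 + 0.129 = 5.851\ \mathrm{ns} \\
\mathrm{slack}    &= t_{\mathrm{data}} - t_{\mathrm{req}}
                   = 3.867 - 5.851 = -1.984\ \mathrm{ns}
\end{align*}
```

So the data path would need to be about 2 ns slower, or the skew about 2 ns smaller, for the slack to reach zero.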

Hope this helps.

Vaughn

prv3299@yahoo.com (Paulo Valentim) wrote in message news:<5ed45146.0305280706.65502179@posting.google.com>...
> Hi! I am working with an Altera Apex 20KE device. I have gotten some
> hold violation errors. I am using a PLL for the clock so I though
> clock skew was not going to be a problem. It turns out that in some
> RAMs (lpm_ram_dp), it is a problem. I don't really know what to do.
> Can somebody help me? Here is an example of the report from the
> Quartus software: Thanks a lot!
> 
> Info: Minimum slack time is -1.984 ns for clock
> PLL1:PLL1_inst|altclklock:altclklock_component|outclock0 between
> source register hudsonbay2_core:inst_hudsonbay2_core|card_hw_if:inst_Card_HW_IF|msacom_m_txr:inst_msacom_m_txr|com_rxd_14_
> and asynchronous destination memory
> hudsonbay2_core:inst_hudsonbay2_core|card_hw_if:inst_Card_HW_IF|msacom_m_txr:inst_msacom_m_txr|rx_qpram_1:p1_rx_qpram_inst|lpm_ram_dpZ1:lpm_ram_dp_component|LPM_RAM_DP:U1|altdpram:sram|q[14]~mem_cell_din
>   Info: + Shortest register to memory delay is 3.867 ns
>     Info: 1: + IC(0.000 ns) + CELL(0.165 ns) = 0.165 ns; Loc. =
> LC3_13_P3; REG Node =
> 'hudsonbay2_core:inst_hudsonbay2_core|card_hw_if:inst_Card_HW_IF|msacom_m_txr:inst_msacom_m_txr|com_rxd_14_'
>     Info: 2: + IC(1.033 ns) + CELL(2.669 ns) = 3.867 ns; Loc. =
> EC6_1_P3; MEM Node =
> 'hudsonbay2_core:inst_hudsonbay2_core|card_hw_if:inst_Card_HW_IF|msacom_m_txr:inst_msacom_m_txr|rx_qpram_1:p1_rx_qpram_inst|lpm_ram_dpZ1:lpm_ram_dp_component|LPM_RAM_DP:U1|altdpram:sram|q[14]~mem_cell_din'
>     Info: Total cell delay = 2.834 ns
>     Info: Total interconnect delay = 1.033 ns
>   Info: - Smallest register to memory requirement is 5.851 ns
>     Info: + Hold relationship between source and destination is 0.000
> ns
>       Info: + Latch edge is -4.365 ns
>         Info: Clock period of Destination clock
> PLL1:PLL1_inst|altclklock:altclklock_component|outclock0 is 15.151 ns
> with  offset of -4.365 ns and duty cycle of 50
>         Info: Multicycle Setup factor for Destination register is 1
>         Info: Multicycle Hold factor for Destination register is 1
>       Info: - Launch edge is -4.365 ns
>         Info: Clock period of Source clock
> PLL1:PLL1_inst|altclklock:altclklock_component|outclock0 is 15.151 ns
> with  offset of -4.365 ns and duty cycle of 50
>         Info: Multicycle Setup factor for Source register is 1
>         Info: Multicycle Hold factor for Source register is 1
>     Info: + Smallest clock skew is 6.064 ns
>       Info: + Longest clock path from clock
> PLL1:PLL1_inst|altclklock:altclklock_component|outclock0 to
> destination memory is 7.774 ns
>         Info: 1: + IC(0.000 ns) + CELL(0.000 ns) = 0.000 ns; Loc. =
> PLL_2; CLK Node = 'PLL1:PLL1_inst|altclklock:altclklock_component|outclock0'
>         Info: 2: + IC(1.710 ns) + CELL(0.507 ns) = 2.217 ns; Loc. =
> LC9_16_P1; REG Node =
> 'hudsonbay2_core:inst_hudsonbay2_core|card_hw_if:inst_Card_HW_IF|msacom_m_txr:inst_msacom_m_txr|TXR_CTRL:txr_ctrl1|RX1_WEN'
>         Info: 3: + IC(3.608 ns) + CELL(1.949 ns) = 7.774 ns; Loc. =
> EC6_1_P3; MEM Node =
> 'hudsonbay2_core:inst_hudsonbay2_core|card_hw_if:inst_Card_HW_IF|msacom_m_txr:inst_msacom_m_txr|rx_qpram_1:p1_rx_qpram_inst|lpm_ram_dpZ1:lpm_ram_dp_component|LPM_RAM_DP:U1|altdpram:sram|q[14]~mem_cell_din'
>         Info: Total cell delay = 2.456 ns
>         Info: Total interconnect delay = 5.318 ns
>       Info: - Shortest clock path from clock
> PLL1:PLL1_inst|altclklock:altclklock_component|outclock0 to source
> register is 1.710 ns
>         Info: 1: + IC(0.000 ns) + CELL(0.000 ns) = 0.000 ns; Loc. =
> PLL_2; CLK Node = 'PLL1:PLL1_inst|altclklock:altclklock_component|outclock0'
>         Info: 2: + IC(1.710 ns) + CELL(0.000 ns) = 1.710 ns; Loc. =
> LC3_13_P3; REG Node =
> 'hudsonbay2_core:inst_hudsonbay2_core|card_hw_if:inst_Card_HW_IF|msacom_m_txr:inst_msacom_m_txr|com_rxd_14_'
>         Info: Total interconnect delay = 1.710 ns
>     Info: - Micro clock to output delay of source is 0.342 ns
>     Info: + Micro hold delay of destination is 0.129 ns
> 
> 
> 
>          -- Paulo Valentim

Article: 56139
Subject: Re: JTAG madness
From: rickman <spamgoeshere4@yahoo.com>
Date: Thu, 29 May 2003 08:35:32 -0400
Arie de Muynck wrote:
> 
> It is only needed to disable and pullup the TMS input: that will clock '1'
> bits into TMS bringing a part into JTAG reset (causing normal operation of
> the part).
> It is a good habit to have at least a pullup on TMS: if not driven by JTAG
> and spikes occur on TCK at least the parts stay in normal operation.
> 
> Example: ARM core first in chain, the rest behind it, TDO from ARM routed to
> a select jumper (full chain or ARM only TDO to JTAG) and a jumper that
> disconnects to isolate the rest of the chain from TMS
> 
>                 +--------------------o A
>                 |                    o----------- TDO
> TDI ---  ARM  --+--  ...REST...  ----o B
>           |              |
> TMS ------+---o o--------+-- pullup
>                B
> 
> Close A for JTAG debugging of the ARM core, B for full chain JTAG boundary
> scan or CPLD programming etc.
> 
> The use of a separate TRST with an ARM core is mainly to allow RESET to keep
> the core in reset until the breakpoint registers have been set to break the
> core after RESET is released.
> Some ARM cores don't handle an instruction "0xF......." very well, they
> crash beyond JTAG reset. And unfortunately this is the content of an
> unprogrammed Flash...

This is what I planned to do for the DSP/FPGA chain.  Since the CPLDs
seem to have pretty good support for chained devices, I expect this will
be a sufficient solution for the ARM/CPLD chain as well.  Since it is
likely that I will use a custom connector on this device, I might even
be able to put the jumpers in the cable.  
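
The TMS pull-up advice quoted above rests on a property of the IEEE 1149.1 TAP controller: with TMS held high, five TCK edges are enough to reach Test-Logic-Reset from any state. A minimal Python sketch of the state machine can check that. The table layout and the helper name `clocks_to_reset` are illustrative only and not taken from any vendor tool.

```python
# Behavioural model of the IEEE 1149.1 TAP controller.
# Each entry maps a state to (next state if TMS=0, next state if TMS=1).
TAP_NEXT = {
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle":    ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-DR-Scan":   ("Capture-DR", "Select-IR-Scan"),
    "Capture-DR":       ("Shift-DR", "Exit1-DR"),
    "Shift-DR":         ("Shift-DR", "Exit1-DR"),
    "Exit1-DR":         ("Pause-DR", "Update-DR"),
    "Pause-DR":         ("Pause-DR", "Exit2-DR"),
    "Exit2-DR":         ("Shift-DR", "Update-DR"),
    "Update-DR":        ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-IR-Scan":   ("Capture-IR", "Test-Logic-Reset"),
    "Capture-IR":       ("Shift-IR", "Exit1-IR"),
    "Shift-IR":         ("Shift-IR", "Exit1-IR"),
    "Exit1-IR":         ("Pause-IR", "Update-IR"),
    "Pause-IR":         ("Pause-IR", "Exit2-IR"),
    "Exit2-IR":         ("Shift-IR", "Update-IR"),
    "Update-IR":        ("Run-Test/Idle", "Select-DR-Scan"),
}


def clocks_to_reset(start):
    """Count TCK edges needed to reach Test-Logic-Reset with TMS held at 1."""
    state, clocks = start, 0
    while state != "Test-Logic-Reset":
        state = TAP_NEXT[state][1]  # TMS pulled up, so always take the TMS=1 branch
        clocks += 1
    return clocks


# Worst case over all 16 states
worst_case = max(clocks_to_reset(s) for s in TAP_NEXT)
print(worst_case)  # -> 5
```

So a part whose TMS is disconnected and pulled up returns to normal operation after at most five TCK edges, whether those edges are intentional or spikes.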

-- 

Rick "rickman" Collins

rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design      URL http://www.arius.com
4 King Ave                               301-682-7772 Voice
Frederick, MD 21701-3110                 301-682-7666 FAX

Article: 56140
Subject: Re: Cyclone doesn't non-clock rom?
From: vbetz@altera.com (Vaughn Betz)
Date: 29 May 2003 05:37:27 -0700
"leon qin" <leon.qin@2911.net> wrote in message news:<bb1ip2$4ldua$1@ID-185326.news.dfncis.de>...
> when I recompile an old design into Cyclone,it tell :
> 
> Assertion error: Can't convert ROM for Cyclone device family using
> altsyncram megafunction -- at least one clock is needed in order to
> implement benchmarking mode for altsyncram

This message is pretty cryptic.  Sorry.

This is complaining about a RAM that is illegal for Cyclone. 
Typically this is due to the RAM being asynchronous (not using
registers on its inputs).  This was supported in APEX / FLEX 10K, but
is not in Cyclone or Stratix.  I've sent a request off to the owner of
this code to get a more explicit message.

Vaughn

Article: 56141
Subject: Re: Antifuse and SRAM FPGA
From: rickman <spamgoeshere4@yahoo.com>
Date: Thu, 29 May 2003 08:40:01 -0400
Wong wrote:
> 
> Hi FPGA experts,
>   What are the differences between those Antifuse
> FPGAs(One-time-programmable, OTP) and SRAM-based FPGAs(I called it
> Many-time-programmable, MTP) ?    ;)
>   If one is using antifuse and would like to 'migrate' to SRAM-based,
> what are the *important* considerations before the migration ?
>   As far as I know, one of the considerations would be that the amount of
> Sequential and Combinational cells is greatly 'sacrificed' for RAMs in
> SRAM-based FPGAs (compare both with the same amount of system gates).
> Besides, SRAM-based FPGAs need external ROM during power-up to
> download the bitstream. But look at Antifuse FPGAs: they are
> S&C-cell-rich compared to SRAM-based and need no extra device for
> downloading the bitstream.
>   Any other considerations ? Please feel free to correct me if I am
> wrong.
>   Thank you guys.

I think you hit on the main differences.  But I am not clear on what you
mean by "sacrificed".  Both anti-fuse and SRAM based devices have
separate RAM blocks.  The Xilinx SRAM based devices can use the RAM
based LUTs as small blocks of RAM as well.  On the other hand, I believe
that you can create FFs (RAM) from logic in the anti-fuse parts.  So if
you have enough logic cells you can make RAM.  

Otherwise the differences are in the debugging stage.  The anti-fuse
parts require that you have a way to remove devices in order to change
your program and SRAM parts can be changed by reprogramming over a
cable.  

Also, don't forget Flash based parts which combine many of the features
of both.  Lattice has some XPLD and xFPGAs both using Flash *and* SRAM. 

-- 

Rick "rickman" Collins

rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design      URL http://www.arius.com
4 King Ave                               301-682-7772 Voice
Frederick, MD 21701-3110                 301-682-7666 FAX

Article: 56142
Subject: Re: 20 to 5 encoder optimization?
From: mrand@my-deja.com (Marc Randolph)
Date: 29 May 2003 05:55:41 -0700
Gil Herbeck <gil@radix20.com> wrote in message news:<3ED595D3.7000807@radix20.com>...
> Muzaffer Kal wrote:
> 
> > Hi,
> > I need to implement 20 (one hot) to 5 encoder. The result would cover
> > 21 codes of the 32 available (one for no input active case). I am
> > hoping that I can do better than binary encoding. The question is how
> > do I find an optimal encoding where any unique assignment of the 21
> > codes with the remaining codes assigned in a don't care fashion would
> > be acceptable. The optimality criteria can be minimum area (say
> > minimum number of 2 input gates). One way would be to write a script
> > which generates Verilog rtl for all 2^32 possible assignments and run
> > the synthesizer for a long time trying all of them. Any other good
> > ways ?
> Actually, the number of possible encodings is much greater than 2^32.
> It is ...
>      32! / 11!
> 
> Binary encode, leading zero count, trailing zero count, they are all
> roughly the same circuit really - a log2 structure.  You'll find them
> built into common synthesizers, but I think they are optimized for
> performance.  You can probably figure out a ripple structure that
> would save some area.

Or if you have multiple clock cycles, a 5 bit counter could sweep the
20 bit input looking for a one.  Since counters typically use
dedicated carry logic, it seems like the area consumed would be pretty
small.
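
A rough behavioural sketch of that sweep, in Python rather than HDL. The code assignment is an assumption made for illustration: code 0 means no input active, and code k+1 means input bit k is hot. The function name `sweep_encode` is hypothetical.

```python
def sweep_encode(one_hot, width=20):
    """Model a counter stepping across the input, one bit per clock cycle.

    Returns (code, cycles). code is 0 when no bit is set, otherwise the
    index of the first set bit plus one. cycles is how many clocks the
    sweep took before it stopped.
    """
    for count in range(width):
        if (one_hot >> count) & 1:
            return count + 1, count + 1
    return 0, width


print(sweep_encode(1 << 0))   # -> (1, 1)
print(sweep_encode(1 << 19))  # -> (20, 20)
print(sweep_encode(0))        # -> (0, 20)
```

The trade is latency for area: up to 20 clocks per result in this model, against a single counter plus a 20-to-1 selection of the input bit.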

Let us know what you decide on.

   Marc

Article: 56143
Subject: Re: JTAG madness
From: iddw@hotmail.com (Dave Hansen)
Date: Thu, 29 May 2003 12:58:08 GMT
On Wed, 28 May 2003 23:41:02 GMT, eric.jacobsen@ieee.org (Eric
Jacobsen) wrote:

[...]

>...I think.  If I don't think, am I not?

So Rene Descartes walks into a bar.  The bartender asks, "Can I get
you a beer?"  Descartes says, "I think not," and vanishes.

>
>I'd better stop now.

Me too.  Regards,

                               -=Dave
-- 
Change is inevitable, progress is not.

Article: 56144
Subject: Re: New version,Low Speed
From: vbetz@altera.com (Vaughn Betz)
Date: 29 May 2003 06:02:25 -0700
Hi Leon,

As Paul points out, some random bounce is pretty much unavoidable.  To
minimize the bounce / get the best result, you can "seed sweep", which
basically runs the compiler multiple times with slightly different
starting points.  By picking the best (luckiest) run, you can gain a
little bit of speed.  This also tends to cut the random bounce from
release to release, of course at the expense of compile time.

To do this, go to Tools->Tcl Scripts->Sweeper.  The script is pretty
self-explanatory -- tell it what you want, and how many compiles
you're willing to do.

You can also do this manually by using the "Initial Placement
Configuration" (sometimes called "Seed") setting in the
Assignments->Settings->Fitter Settings dialog.
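
The idea behind a seed sweep can be sketched in a few lines of Python. This does not invoke Quartus: `fake_fit` is a hypothetical stand-in for one place-and-route run, and the 60 MHz nominal Fmax and +/-5% spread are made-up numbers chosen only to mimic the random variation discussed in this thread.

```python
import random


def fake_fit(seed):
    """Hypothetical stand-in for one fitter run; returns an Fmax in MHz.

    A real sweep would launch the compiler with this seed and read Fmax
    from the timing report instead.
    """
    rng = random.Random(seed)
    return 60.0 * (1 + rng.uniform(-0.05, 0.05))


def seed_sweep(seeds, fit=fake_fit):
    """Run one fit per seed and return (best_seed, all_results)."""
    results = {seed: fit(seed) for seed in seeds}
    best_seed = max(results, key=results.get)
    return best_seed, results


best_seed, results = seed_sweep(range(1, 11))
print(f"best seed {best_seed}: {results[best_seed]:.2f} MHz")
```

The cost is one full compile per seed, which is the compile-time expense mentioned above.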

As Paul pointed out, another option is to back-annotate the placement
and/or routing to "lock in" the results of a good compile.  All Altera
devices support back-annotation of placement.  Only Stratix,
Stratix-GX and Cyclone support back-annotation of routing.

When back-annotating placement to lock in performance, the best option
is typically to back-annotate and demote assignments to LABs (i.e.
don't lock down to the LE level).  This reduces, but does not
completely eliminate, bounce unless you also lock down the routing,
and unfortunately locking down the routing isn't supported for the
10K.

Vaughn

"Paul Leventis" <paul.leventis@utoronto.ca> wrote in message news:<kihza.10671$cK1.3416@news01.bloor.is.net.cable.rogers.com>...
> Hi Leon,
> 
> I don't know whether or not anything at all changed about the timing models
> or optimization techniques for the EP1K devices between these two releases.
> But the most likely explanation for the "problem" you're seeing is random
> noise.
> 
> All place and route algorithms suffer from random noise.  It is impossible
> to solve these NP problems perfectly -- so heuristics and various
> stochastic optimization techniques are employed.  One by-product of this is
> that if anything at all changes about the problem -- the netlist, the timing
> model, the algorithm cost functions, your timing constraints, even the way
> floating-point numbers are truncated/rounded -- then the algorithms may get
> different results.
> 
> To observe this first-hand, make a slight change to your Fmax target (try
> 59.9, 59.95, 60, 60.05, 60.10).  What you should see is a "small" (5-10%???)
> random variation in final results uncorrelated with your Fmax target.  If
> you do this for both releases of Quartus and average each of the runs, you
> will probably find that the difference goes away.
> 
> For some or all devices in Quartus (I honestly don't know :-)) you can
> back-annotate (or save) the placement and routing from one release and import
> it into the next.  This would eliminate the random noise between two
> releases and limits Fmax changes to those arising from timing models.  The
> down-side of doing this is that you forego any algorithm enhancements
> between releases as we are constantly improving the quality of place and
> route, primarily for our newer families.
> 
> Regards,
> 
> Paul Leventis
> Altera Corp.
> 
> [This is from spammable account]
> 
> "leon qin" <leon.qin@2911.net> wrote in message
> news:bak4hi$jjag$1@ID-185326.news.dfncis.de...
> > I had done a design with QuartusII2.1,fit to EP1K50-3,
> > and got Fmax at 61MHz.
> > Today when I try to fit it again with QuartusII 2.2SP2,
> > I can get Fmax  at 58MHz ONLY!
> > So stupid!!!
> >
> >                                leon
> >
> >

Article: 56145
Subject: Re: 20 to 5 encoder optimization?
From: nospam <nospam@nospam.invalid>
Date: Thu, 29 May 2003 14:04:07 +0100
Muzaffer Kal <kal@dspia.com> wrote:

>Hi,
>I need to implement 20 (one hot) to 5 encoder. The result would cover
>21 codes of the 32 available (one for no input active case). I am
>hoping that I can do better than binary encoding. The question is how
>do I find an optimal encoding where any unique assignment of the 21
>codes with the remaining codes assigned in a don't care fashion would
>be acceptable. 

If you don't care about more than one hot input then the logic behind this
is just 5 multi-input OR gates. 

I assume the optimum solution will minimize and balance the input width of
these gates. 

0 for no hot inputs will already be decided, of the remaining 31 encodings
discard all with 4 or more 1's (15,23,27,29,30,31) then pick another 4 with
3 1's to balance the widths. I think it will come down to 3 x 9 input and 2
x 8 input gates. 

That said, the optimum is only marginally better than straight binary.
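
One way to check the gate widths is to tally them directly. The Python sketch below assumes each output bit is an OR of every input whose code has that bit set, so a gate's width is the number of codes with a 1 in that position. The particular low-weight choice (all codes with one or two 1's, plus the five rotations of 00111) is just one balanced selection picked for illustration.

```python
def popcount(x):
    """Number of 1 bits in x."""
    return bin(x).count("1")


def rotl5(x, n):
    """Rotate a 5-bit value left by n positions."""
    return ((x << n) | (x >> (5 - n))) & 0b11111


def or_gate_widths(codes, bits=5):
    """Input count of the OR gate behind each output bit (bit 0 first)."""
    return [sum((c >> b) & 1 for c in codes) for b in range(bits)]


# Straight binary: input k gets code k, for k = 1..20 (0 means no input hot)
binary_codes = list(range(1, 21))

# Low-weight assignment: 5 codes with one 1, 10 codes with two 1's,
# and 5 codes with three 1's chosen so every bit is used equally often
low_weight_codes = [c for c in range(1, 32) if popcount(c) <= 2]
low_weight_codes += [rotl5(0b00111, n) for n in range(5)]

print(or_gate_widths(binary_codes))      # -> [10, 10, 9, 8, 5]
print(or_gate_widths(low_weight_codes))  # -> [8, 8, 8, 8, 8]
```

By this tally the balanced assignment needs 40 gate inputs in total against 42 for straight binary, a little tighter than the rough estimate above and still only a marginal gain.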



Article: 56146
Subject: Re: Simulation in Altera Quartus II
From: "Subroto Datta" <sdatta@altera.com>
Date: Thu, 29 May 2003 13:40:07 GMT
Jens,
    Due to the transformational nature of logic synthesis, some internal
nets may not be present after the synthesis step. In order to create a
simulation vector set that consists of observable nodes, you can use the
Post Compilation Filter in the Node Finder. Net names listed there are
present in the final synthesized, placed and routed database.
In general, outputs of registers, logic cells and primary I/O are observable.

- Subroto Datta
Altera Corp.

Jens Nowack <its.me.hates-spam@uni.de> wrote in message
news:bb2547$52qpl$1@ID-192450.news.dfncis.de...
> Hello,
>
> I have a lot of signals, inputs, outputs and variables in my VHDL code and
> want to simulate it.
> In my vector source file I have selected some, e.g. clk (signal), data_in
> (signal), data_out (signal) etc.
> After simulation some signals do not show up in the simulation waveforms.
> The simulation message window shows messages like:
>
> Compiler synthesized away node s_enable_ram_a. Ignored vector source file
> node.
> Ignored node in vector source file. Can't find corresponding node name
> s_adress_a in design.
> Compiler synthesized away node s_pa2se_tmp[0]. Ignored vector source file
> node.
>
> In my VHDL code these signals exist. Why do these warnings occur?
> How can I watch variables, signals etc. during simulation?
>
> Best regards
>
>
>
>



Article: 56147
Subject: Re: FIFO Controller
From: mrand@my-deja.com (Marc Randolph)
Date: 29 May 2003 06:51:29 -0700
Peter Alfke wrote:
> I am looking at revamping the FIFO cores, giving you many options:
> asynchr. vs synchronous, with exact empty and full
> extra one-clock-early empty and full indicators
> programmable almost empty and full indicators, 
> readable occupied size ,
> etc
> Any additional suggestions?

Howdy Peter,

This might be more effort than you had in mind, but we've had a need
several times for an asymmetric async FIFO... 8 bits in @ 155 MHz, 32
bits out @ 50 MHz.  And the reverse of course.  While I'm dreaming, we
could use them in both BRAM and LUT RAM form.
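
For reference, the width conversion being asked for can be modelled in a few lines of Python. This is a behavioural sketch only: it ignores the two clock domains entirely, and the byte order (first byte written lands in the most significant position) is an assumption. The class name `WidthConvertingFifo` is hypothetical. On the rates above, the write side delivers 8 x 155 = 1240 Mbit/s and the read side can drain 32 x 50 = 1600 Mbit/s, so the reader can keep up.

```python
from collections import deque


class WidthConvertingFifo:
    """8-bit writes, 32-bit reads; clock domain crossing is not modelled."""

    def __init__(self):
        self._bytes = deque()

    def write_byte(self, value):
        """Push one byte on the narrow (write) side."""
        self._bytes.append(value & 0xFF)

    def can_read_word(self):
        """A 32-bit word is available once four bytes are buffered."""
        return len(self._bytes) >= 4

    def read_word(self):
        """Pop four bytes and pack them, first byte most significant."""
        if not self.can_read_word():
            raise IndexError("fewer than four bytes buffered")
        word = 0
        for _ in range(4):
            word = (word << 8) | self._bytes.popleft()
        return word


fifo = WidthConvertingFifo()
for b in (0x12, 0x34, 0x56, 0x78):
    fifo.write_byte(b)
print(hex(fifo.read_word()))  # -> 0x12345678
```

The reverse direction (32 bits in, 8 bits out) would unpack each word into four bytes in the same order.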

I know you asked for FIFO stuff, but while I'm making a list, here are
some things that would be very useful for RAM cores (not FIFOs). 
Either of these may not be possible due to architecture limitations -
in which case, perhaps keep them in mind for future devices.

1. I'll bet A LOT of designs use FPGA memory for an array of counters
or pointers (ie, a 20 or 32 bit counter/pointer per port on a 24 port
card) - we certainly do.  So in addition to dout, having a dout+1
could solve a little grief related to using them that way (meeting
timing without pipelining, which cuts your rate in half) and/or
possibly save some area (especially if the carry chain could share the
same slice as a LUT ram).

2. A triple port sync RAM.  Two ports read, one port write.


BTW, have you been able to find out anything on Hal's question of a
couple months ago on async resets:
 http://groups.google.com/groups?selm=v71unlhlf3ds74%40corp.supernews.com

Thank you,

   Marc

Article: 56148
Subject: Re: Multiply 19.44MHz with Virtex-II DCM
From: Austin Lesea <Austin.Lesea@xilinx.com>
Date: Thu, 29 May 2003 07:32:45 -0700
Jay,

Yes, you lose the skew war, as you do not know what your timing is
anymore.  But, from the CLKIN to all of the outputs, you will have known
timing, and at 19.44 MHz, the skew difference is going to be so small,
that maybe you can ignore it?

Austin

Jay wrote:
> 
> Yes, I've thought about the frequency doubler.
> But what I'm afraid of is: would any delay be introduced between fin and
> fout, for there is no guarantee of de-skew.
> Anyway, I would use it as "a tool of last resort".
> 
> "Austin Lesea" <Austin.Lesea@xilinx.com>
> wrote in message: 3ED4C5C4.CCA011EA@xilinx.com...
> > Yes,
> >
> > One can put the input through a simple frequency doubler (see Peter's
> > circuit tricks Xclusive), and then into the input.  This gets the
> > frequency down to 12 MHz for the DLL.  One then uses the duty cycle
> > correction ON (to help fix the asymmetry of the doubled clock).  Since
> > taps are updated every 6 times the 2's complement of the jitter filter
> > settings, the asymmetry of the doubled clock does not violate the input
> > jitter specification.
> >
> > Haven't tried this, but there is no reason why it shouldn't work.  If
> > anyone has it working, let us know.
> >
> > Austin
> >
> > Heavenfish wrote:
> > >
> > > So my question is: is there any alternate way to implement both DLL and DFS
> > > functions when my input clk is less than 24 MHz?
> > > Or do I have to change my application?
> > >
> > > "Austin Lesea" <Austin.Lesea@xilinx.com>
> > > wrote in message: 3ED3C590.F0C93004@xilinx.com...
> > > > Jon,
> > > >
> > > > The DCM CLKFX feature works down to a 1 MHz input frequency (as long as
> > > > the output being synthesized is greater than 24 MHz).
> > > >
> > > > Note that you can not use "sync to DLL" (ie connect CLK0 to CLKFB) in
> > > > this mode (DFS only mode).
> > > >
> > > > Austin

Article: 56149
Subject: Re: JTAG madness
From: Keith Larson <k-larson2@NOSPAM.ti.com>
Date: Thu, 29 May 2003 09:44:22 -0500
Hi Rick

The #1 challenge I can think of is to ensure a clean TCLK.  Basically 
TCLK is used to sample all of the other signals and as long as those are 
stable when TCLK transitions (cleanly), all should be well.

The underlying tricks that I can think of are...

1) Routing a clean TCLK to all devices simultaneously.  A bad example 
would be daisy chaining devices and not having a far end termination. In 
that case you have intermediate levels along the TCLK line during the 
transition that can look like false clocks.  If you must daisy chain a 
lot of devices, either add a buffer every so often (and slow down TCLK 
accordingly), or drive the heck out of the line and add a far end 
termination equaling the Z of the lines in your board.

2) Do not push the TCLK rate so high that the setup and hold times are 
violated.  The goal is clean and stable signals at the time of the 
(clean) TCLK edge.

3) Understanding that as the new generation of processors have evolved 
the input bandwidth for the pin input has also sky-rocketed.  What used 
to be OK may not be OK now.  It is kind of like deselecting the 20 MHz 
bandwidth limit on an oscilloscope.  All kinds of things become visible.

4) The TI JTAG standard for *debugging* processors is a superset of the 
JTAG *test* standard.  The TI JTAG debug version includes two additional 
signals, EMU0 and EMU1, that are used for multiprocessor halt and run (i.e., 
quite important to a number of customers that have LOTS of DSPs per board).

You will note, however, that the JTAG header pinout is *not* the same, nor 
can it be considering what it does.  This makes swapping in and out 
other vendors' JTAG tools a pain in the butt, because you would need to 
create header adapters.  Simple, but I don't see any other way except to 
isolate the various JTAG chains.

Maybe someone could comment on the standardization of the JTAG test port 
header (there might not be one?).

Hope this helps
Keith Larson

-------------------------------------
> I like most of what you said, but I don't think you can *debug* the
> board with all the TMS and TCK signals in parallel.  This would send the
> same instructions to all devices and put them in a possibly poor state
> for normal operation when you only intended to be controlling part P.  I
> expect to have to use jumpers to disable the TRST, TMS or TCK inputs on
> all parts other than P.  
> 
> 

+------------------------------------------+
|Keith Larson                              |
|Member Group Technical Staff              |
|Texas Instruments Incorporated            |
|                                          |
| 281-274-3288                             |
| k-larson2@ti.com                         |
|------------------------------------------+
|     TMS320C3x/C4x/VC33 Applications      |
|                                          |
|               TMS320VC33                 |
|    The lowest cost and lowest power      |
|    floating point DSP on the planet!     |
|              500uw/Mflop                 |
+------------------------------------------+



