Messages from 106375

Article: 106375
Subject: Re: NgdBuild:604 error
From: "Felix Pang" <xiaofei.pang@gmail.com>
Date: 12 Aug 2006 09:38:27 -0700


Mark McDougall wrote:
> Hi,
>
> I'm tearing my hair out and I can't find the answer to this in any of
> the Xilinx solutions!!!
>
> ---8<------8<------8<------8<------8<------8<------8<------8<------8<---
> NgdBuild:604 - logical block 'pace_inst/U_Game/vram_inst' with type
> 'vram' could not be resolved. A pin name misspelling can cause this, a
> missing edif or ngc file, or the misspelling of a type name. Symbol
> 'vram' is not supported in target 'spartan3'.
> ---8<------8<------8<------8<------8<------8<------8<------8<------8<---
>
It seems that NgdBuild simply cannot find the netlist file for the
vram. Not quite sure why.

Felix

Article: 106376
Subject: Re: Invoking Cadence NC Sim within Xilinx ISE
From: "ajeetha" <ajeetha@gmail.com>
Date: 12 Aug 2006 09:40:05 -0700

Anil,
       I've not done this within the ISE framework, but pretty much only
on a Linux machine. It is very straightforward:

  ncverilog -f flist

With "flist" being a file with a list of your source (Verilog) files.
Personally I prefer a 3-step process -
ncvlog/ncelab/ncsim - but this should get you started. Feel free to
write to: tech_help <at> noveldv.com for more. If we find time we will
help you.

Regards
Ajeetha, CVC
www.noveldv.com
* A Pragmatic Approach to VMM Adoption 2006 ISBN 0-9705394-9-5
http://www.systemverilog.us/
* SystemVerilog Assertions Handbook
* Using PSL/Sugar

Article: 106377
Subject: Re: Clock domain crossing (again)
From: "KJ" <kkjennings@sbcglobal.net>
Date: Sat, 12 Aug 2006 16:40:29 GMT


> That was exactly what I assumed.  Regardless, my approach will do the
> job in any case.  This is what we did in a telecom application I worked
> on where the data rate varied over a wide range from T1 to OC48, but
> the clock rate remained the same.  We maintained an internal bus width
> at 32 bits but used a clock enable to allow a very wide range of data
> rates.  Of course the input data and clock were synchronized in just
> the way I described above.  If the serial data is being shifted into an
> 8 or 32 bit register you just gate the 2.048 MHz enable pulse to only
> allow every Nth pulse through to produce a 2.048/N MHz enable signal.
>
> This approach is very common and not at all hard to understand.
>

I agree, it is the simplest and most straightforward approach and is likely 
what the original post is trying to accomplish.  With the 25:1 difference in 
clock speed anything else would be overkill.
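
A minimal behavioural sketch of the enable-gating idea quoted above, written
in Python rather than HDL so it can be run directly. The function name
gate_every_nth and the list-of-0/1 representation of the enable signal are
assumptions made for illustration; they are not from the original posts.

```python
def gate_every_nth(enable_pulses, n):
    """Pass only every n-th asserted enable pulse; block the rest.

    enable_pulses: iterable of 0/1, one sample per system clock cycle.
    Returns a list of the same length holding the divided enable.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    count = 0
    gated = []
    for pulse in enable_pulses:
        if pulse:
            count += 1
            if count == n:
                # This is the n-th pulse since the last one let through.
                gated.append(1)
                count = 0
            else:
                gated.append(0)
        else:
            gated.append(0)
    return gated


# Hypothetical stimulus: one enable pulse every 4 system clocks, N = 3.
base_enable = [1, 0, 0, 0] * 6
divided = gate_every_nth(base_enable, 3)
print(sum(base_enable), sum(divided))
```

Everything stays in the single system clock domain; only the number of
cycles in which the enable is asserted changes.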

KJ

Article: 106378
Subject: Re: JOP as SOPC component
From: "KJ" <kkjennings@sbcglobal.net>
Date: Sat, 12 Aug 2006 17:13:13 GMT


"Martin Schoeberl" <mschoebe@mail.tuwien.ac.at> wrote in message 
news:44ddb2d4$0$8024$3b214f66@tunews.univie.ac.at...
> The Avalon bus is very flexible. Therefore, writing a slave or
> master (SOPC component) is not that hard. The magic is in the Avalon
> switch fabric generated by the builder. However, an example would
> have helped (Altera listening?). I didn't find anything on Altera's
> website or with Google. Now a very simple slave can be found at [1].
>
As you get into making your own components you'll find a lack of 
documentation about important things that go into the .PTF file.  Altera 
used to have a document on their website that was invaluable called the "PTF 
File Reference Manual" (or something like that).  They've chosen to pull 
that out so your only source for crucial information now is your FAE (maybe) 
or someone who happens to have that file available.  I've complained to 
Altera to no avail that they need to put that document back and maintain it 
or at least make it available upon request to component developers.  Maybe 
others also complaining will help as well (hint).

> One thing to take care: When you (like me) like to avoid VHDL files
> in the Quartus directory you can easily end up with three copies of
> your design files. Can get confusing which one to edit. When you
> edit your VHDL file in the component directory (the source for the
> SOPC builder) don't forget to rebuild your system. The build process
> copies it to your Quartus project directory.
>
It's damn annoying of the tool to make those copies the way it does, too.
You have to be very careful about which file you edit as the 'source', or
your edits will get overwritten because it really isn't the source.

> The master is also easy: just address, read and write data,
> read/write and you have to react to waitrequest. See as example the
> SimpCon/Avalon bridge at [2]. The Avalon interconnect fabric handles
> all bus multiplexing, bus resizing, and control signal translation.
>
If you're going for a very high speed design and you have multiple masters
accessing a slave (e.g. multiple CPUs or DMA controllers accessing memory),
the performance degrades rather quickly when SOPC Builder performs the
arbitration.  You don't necessarily need a large number of masters either;
4-5 killed it for me and necessitated a redesign to work around how Avalon
handled things.

> Another point is, in my opinion, the wrong role who has to hold data
> for more than one cycle. This is true for several busses (e.g. also
> Wishbone). For these busses the master has to hold address and write
> data till the slave is ready. This is a result from the backplane
> bus thinking. In an SoC the slave can easily register those signals
> when needed longer and the master can continue.

What you're describing is not a result of 'backplane bus thinking' and is
not a limitation of Avalon.  If it exists in your design then it's a
limitation of the slave component design.  The slave generates the wait
request output to tell the master that it needs to hold the address and
data, because the slave essentially has no space left to hold them itself.
If the slave component has provisions to register and hold the address and
data, then it can do so, leave the wait request output deasserted, and the
cycle completes.  If you think about it, this is simply a one-deep fifo
holding the address/data/command.  Generalize a bit more and you see that
the fifo need not be only one deep; it could be any depth.  As the master
performs reads and writes, these commands are written into the fifo without
asserting wait request.  But remember that any fifo can fill up, at which
point the slave must assert wait request because it has no more room to
store anything, and the master has to hold on to the command for a bit.
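
The fifo argument above can be sketched as a toy behavioural model. This is
Python, not HDL, and the class name BufferedSlave and its methods are
hypothetical names chosen for illustration; it is not an Avalon component,
only a picture of when wait request would have to be asserted.

```python
from collections import deque


class BufferedSlave:
    """Toy model of a slave that queues incoming commands in a fifo."""

    def __init__(self, depth):
        self.depth = depth
        self.fifo = deque()

    @property
    def waitrequest(self):
        # Asserted only when there is no room left to store a command.
        return len(self.fifo) >= self.depth

    def offer(self, command):
        """Master presents a command for one cycle; True if accepted."""
        if self.waitrequest:
            return False  # master must keep holding address/data
        self.fifo.append(command)
        return True

    def service_one(self):
        """Slave finishes its oldest queued command, freeing one slot."""
        return self.fifo.popleft() if self.fifo else None


slave = BufferedSlave(depth=2)
results = [slave.offer(("write", addr, addr * 10)) for addr in range(3)]
print(results)  # the third command is stalled by waitrequest
slave.service_one()
print(slave.offer(("write", 2, 20)))  # accepted once a slot is free
```

With depth=1 this reduces to the single register stage described above; a
deeper fifo only postpones the point at which the master has to wait.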

> On the other hand,
> as JOP continues to execute and it is not so clear when the result
> is read, the slave should hold the data when available. That is easy
> to implement, but Wishbone and Avalon specify just a single cycle
> data valid.
>
What you would need then is a signal generated by the master back to the
slave to say that the master isn't ready to receive the data, which would
cause the slave to hold on to the read data.  But if you think about it a
bit more, the only reason the slave is providing read data at all is that
the master requested it.  If the master wasn't ready to receive data it
should simply not assert its read output.

By the way, Avalon has a leg up on Wishbone with regard to a cleaner logical
approach to handling wait states and latency.  Avalon treats the address
cycle as a single phase controllable by the slave's wait request and
separates that from the read data phase by allowing for latency with the
'readdatavalid' output.  With Wishbone you can accomplish the same thing by
extending the bus definition with 'tags', but since not all components are
required to support 'tags', when you have a mismatch you're on your own for
getting the interconnect right.  With Avalon, they designed it right with a
clear logical distinction between address and data phases so that any
incompatibilities between master and slave can still be handled
automatically by a tool (SOPC Builder).

KJ

Article: 106379
Subject: Re: Embedded clocks
From: "rickman" <spamgoeshere4@yahoo.com>
Date: 12 Aug 2006 10:16:40 -0700

Brian Drummond wrote:
> On 11 Aug 2006 14:16:19 -0700, "rickman" <spamgoeshere4@yahoo.com>
> wrote:
>
> >Frank Buss wrote:
> >> rickman wrote:
> >>
> >> > Is self clocking on a single pin possible?  I am thinking that the
> >> > extra info has to be presented in some manner that requires either a
> >> > timing or amplitude measurement.
> >>
> >> As Jim wrote, the one-wire bus can do this. With this concept you need only
> >> one wire (and ground) to power and communicate with a device:
> >>
> >> http://pdfserv.maxim-ic.com/en/an/onewirebus.pdf
> >> http://www.maxim-ic.com/appnotes.cfm/an_pk/126
> >
> >Thanks to everyone for their posts.  Each of the above solutions
> >requires timing of the signal and so will not work without a clock (or
> >timer) of a specified rate.  The key is "specified".  To decode a
> >Manchester stream you need a time interval of about 3/4 of the bit time
> >in order to blank the edge detector on the edge between bits.
>
> If you *know* it's Manchester coding, and have *no idea* of the clock
> rate, the problem can be solved if you can afford to spend some time
> framing up. It's harder if you instantly need to decode the first bit.
>
> Basic approach is to measure the times between transitions, compare
> several successive transition intervals, and classify them as "long" or
> "short" compared to each other. THEN take a mean value, apply blanking
> (clock recovery), and start framing up.
>
> (If you need to retroactively decode the first bit, you'll need to store
> and re-visit the first few transition times. This may be easier with the
> assistance of an embedded CPU)
>
> There will need to be some constraints on data, otherwise an infinite
> sequence of '0's or '1's would take infinitely long to decode. In S/PDIF
> or EBU/AES digital audio for example, this comes in the form of an
> extra-long transition interval (1.5 bit times) during a preamble, the
> trailing edge of which is guaranteed to correctly sync the clock
> recovery circuit.

I am familiar with how to recover Manchester data.  The problem is that
you *do* have to measure the clock rate or know it.
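
A rough sketch of the quoted "classify the intervals against each other"
step, in Python for illustration only. The function name classify_intervals
and the 1.5x threshold are assumptions, not something from the thread. Note
that the edge timestamps fed to it already come from some local timer, which
is exactly the time reference in question.

```python
def classify_intervals(edge_times):
    """Label gaps between transitions as 'S' (half bit) or 'L' (full bit).

    Returns (labels, estimated_bit_time). Raises ValueError if the capture
    contains only one interval length, since the rate is then ambiguous.
    """
    gaps = [b - a for a, b in zip(edge_times, edge_times[1:])]
    if not gaps:
        raise ValueError("need at least two transitions")
    shortest = min(gaps)
    # A full-bit gap is nominally twice a half-bit gap; split at 1.5x.
    labels = ["S" if g < 1.5 * shortest else "L" for g in gaps]
    if len(set(labels)) < 2:
        raise ValueError("only one interval length seen; cannot frame up")
    # Each short gap spans half a bit time, each long gap a whole one.
    halves = sum(1 if lab == "S" else 2 for lab in labels)
    bit_time = 2.0 * sum(gaps) / halves
    return labels, bit_time


# Hypothetical captures in ns: same pattern at 10 Mbps and at 1 Mbps.
fast_edges = [0, 50, 100, 200, 250, 300, 400]
slow_edges = [t * 10 for t in fast_edges]
print(classify_intervals(fast_edges))
print(classify_intervals(slow_edges))
```

The ValueError case corresponds to the constraint on data mentioned in the
quoted post: a run with only one interval length cannot be framed up.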

The thread has broken into two discussions. One is about how to recover
Manchester data and the minimum rate clock to use.  The other is about
self clocking data encoding and whether you can decode it without a
time reference.  In reality there is not a practical way to do that.  I
had not given the question much thought when I posed it and I see now
that all the "self clocking" schemes are framed in some rate and the
clock is recovered given a reference.

> >So I can't read a Manchester stream at 10 Mbps and one at 1
> >Mbps with the same timer.
>
> This approach should allow that - given some quiet time between
> different streams, to enable you to recognise a switch in rate.

Yes, you can in essence construct a very wide range PLL to decode a
Manchester signal.  It still uses a time reference and a very complex
one at that.  I was actually looking for a simple way to decode a
combined data and clock signal without having a time reference.  For
most practical purposes this is not doable.

An analog of what I would like to do is the I2C bus.  It is designed to
work at any rate down to 0 Hz.  Of course it uses a clock separate from
data.  It would be nice if that could be done on one digital wire
rather than two.  But I see that this is not practical without going to
multiple voltage levels or adding a reference clock.

Article: 106380
Subject: Re: Clock domain crossing (again)
From: burn.sir@gmail.com
Date: 12 Aug 2006 10:18:26 -0700

rickman wrote:
> That was exactly what I assumed.  Regardless, my approach will do the
> job in any case.  This is what we did in a telecom application I worked
> on where the data rate varied over a wide range from T1 to OC48, but
> the clock rate remained the same.  We maintained an internal bus width
> at 32 bits but used a clock enable to allow a very wide range of data
> rates.  Of course the input data and clock were synchronized in just
> the way I described above.  If the serial data is being shifted into an
> 8 or 32 bit register you just gate the 2.048 MHz enable pulse to only
> allow every Nth pulse through to produce a 2.048/N MHz enable signal.
>
> This approach is very common and not at all hard to understand.


Couldn't agree more! One clock is all you need.

Anyone who has seen a UART with 3 clocks (system, bps, bps*16), raise
your hand!


-Burns

Article: 106381
Subject: Re: JOP as SOPC component
From: "KJ" <kkjennings@sbcglobal.net>
Date: Sat, 12 Aug 2006 17:42:53 GMT


"Martin Schoeberl" <mschoebe@mail.tuwien.ac.at> wrote in message 
news:44ddc530$0$11352$3b214f66@tunews.univie.ac.at...
> That's fine for me. When the connection magic happens and I don't
> have to care it's fine. OK, one exception: Perhaps I would like
> to know more details on the latency. The switch fabric is 'plain'
> VHDL or Verilog. However, generated code is very hard to read.
>

What?  You don't have a display that can show 2000 columns on your screen as 
is nearly required to view the VHDL/Verilog that pops out of SOPC?

Actually the best place I've found to look at and understand the wait states
and latency is simply the .PTF file, since that's where all the information
is.  The .PTF file requires a bit of a learning curve due to the lack of
documentation on Altera's part, but it's not that hard.  Once you get a feel
for it, it is very easy to see whether a slave device requires wait states
(and if it does, whether it is a fixed number or controllable by the slave)
and whether the slave device has any read latency (and if it does, whether
it is a fixed number or controllable by the slave, and how many reads can be
pending at one time).  Looking at the VHDL is much harder, and it is not
truly the source code anyway; the 'source' really is the .PTF file, since
the VHDL gets generated from it.

>
>> the avalon master is really as simple as the slave.
>
> Almost, you have to hold address, data and read/write active
> as long as waitrequest is pending. I don't like this, see above.
>

The master side is a bit more complicated than the slave side.

There is a very simple template though that one must almost always follow
for the master.  When you try to deviate from it you're likely to get burned
(voice of experience, I've already had to fix others' code in this area).
The template is

process(Clock)
begin
    if rising_edge(Clock) then
        if (Reset = '1') then
            Read <= '0';
            Write <= '0';
        elsif (WaitRequest = '0') then
            -- Put your code here for whenever it is you want to read
            -- and/or write.
            -- When writing you would also set WriteData here.
            -- For example, if you're not ready to receive data whenever
            -- the slave says it is ready, then you simply set Read <= '0'
            -- until you are ready.
        end if;
    end if;
end process;

How to sample the data on a read depends on whether the master implements
the 'readdatavalid' input (i.e. is 'latency aware' in Avalon terminology).
If it does, sample the data when readdatavalid is asserted; if not, sample
the data when the read output is asserted and wait request is not.
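
To make the hold-while-waiting rule and the non-latency-aware sampling rule
concrete, here is a simplified cycle-by-cycle model in Python. It is only a
sketch under assumptions: the function name run_master, the per-cycle
waitrequest list and the dict standing in for the slave are all made up for
illustration, and it covers reads only.

```python
def run_master(addresses, waitrequest_trace, memory):
    """Cycle-by-cycle model of a non-latency-aware master doing reads.

    addresses: addresses the master wants to read, in order.
    waitrequest_trace: slave's waitrequest value (0/1) for each clock cycle.
    memory: dict mapping address -> data, standing in for the slave.
    Returns a list of (address, data) pairs captured by the master.
    """
    pending = list(addresses)
    read, address = 0, None  # registered master outputs (reset state)
    captured = []
    for waitrequest in waitrequest_trace:
        # At each clock edge the master sees its previously registered
        # outputs; data is taken when read is high and waitrequest is low.
        if read and not waitrequest:
            captured.append((address, memory[address]))
        if not waitrequest:
            # Outputs may only change while waitrequest is low.
            if pending:
                read, address = 1, pending.pop(0)
            else:
                read, address = 0, None
        # Otherwise read and address are held for another cycle.
    return captured


# Hypothetical run: the slave stalls the first read for two cycles.
memory = {4: 0xAA, 8: 0xBB}
print(run_master([4, 8], [0, 1, 1, 0, 0, 0], memory))
```

Because the outputs are only updated inside the waitrequest-low branch, the
address stays stable for the whole stalled read, which is what the template
is meant to guarantee.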

> In my case e.g. the address from JOP (= top of stack) is valid
> only for a single cycle. To avoid one more cycle latency I present
> in the first cycle the TOS and register it. For additional wait
> cycles a MUX switches from TOS to the address register. I know this is a 
> slight violation of the Avalon specification.
> There can be some glitches on the MUX switch.

You might try incorporating the above mentioned template to avoid the
Avalon violation.  What I've also found in debugging others' code that
doesn't adhere to the template is that there can be subtle errors that take
just the right combination of events to cause an actual system error of
some sort (i.e. not just the Avalon generated assert in simulation).  If you
use the above template, you're guaranteed to be Avalon compliant and not
have this issue.

In my opinion, the Avalon bus, with .PTF files to completely define
component I/O interfaces, is a huge improvement over Wishbone.  Although
others disagree and don't like .PTF, they don't offer any alternative other
than comments or documentation for defining all those interface things that
one needs to know (i.e. wait states, latency, bus size, etc.).  Comments and
documentation are nice, but they are not synthesizable whereas .PTF files
are (i.e. SOPC Builder sucks them in and spits out VHDL/Verilog).  .PTF may
not be a standard anywhere outside of Altera, but then is there an open
standard that defines a file format that can be used to accomplish what .PTF
does?  I haven't run across it, and if there is one, I wouldn't mind
badgering the tool vendors to support it so that I'm not locked into a
vendor specific implementation.  Until then I can be much more productive
using PTF than not.

> For synchronous on-chip
> peripherals this is absolutely no issue. However, these signals
> are also used for off-chip asynchronous peripherals (SRAM).
> However, I assume that these possible switching glitches are
> not really seen on the output pins (or at the SRAM input).

Again, if you use the template, you won't have the glitching even if the
signals go off chip to a device.

KJ

Article: 106382
Subject: Re: JOP as SOPC component
From: "Martin Schoeberl" <mschoebe@mail.tuwien.ac.at>
Date: Sat, 12 Aug 2006 19:49:24 +0200

>>> as very simple example for avalon master-slave type of peripherals there
>>> is on free avalon IP core for SD-card support the core can be found
>>> at some russian forum and later it was also added to the user ip
>>> section of the microtronix forums.
>>
>> Any link handy for this example?
>>
> http://forum.niosforum.com/forum/index.php?showtopic=4430
>
Nice, but not a real introductory example. It's a slave
and a master. Do you know what the master port is for in
this SD controller? And it looks like it's time for me to learn
a little bit of Verilog - too many Verilog examples around ;-)

Martin

Article: 106383
Subject: Re: JOP as SOPC component
From: "Antti" <Antti.Lukats@xilant.com>
Date: 12 Aug 2006 10:55:51 -0700

Martin Schoeberl schrieb:

> >>> as very simple example for avalon master-slave type of peripherals there
> >>> is on free avalon IP core for SD-card support the core can be found
> >>> at some russian forum and later it was also added to the user ip
> >>> section of the microtronix forums.
> >>
> >> Any link handy for this example?
> >>
> > http://forum.niosforum.com/forum/index.php?showtopic=4430
> >
> Nice, but not a real introductory example. It's a slave
> and a master. Do you know what the master port is for in
> this SD controller? And it looks like it's time for me to learn
> a little bit of Verilog - too many Verilog examples around ;-)
>
> Martin

hm I guess I said master-slave in the first place.
the slave interface is to set up the sector address and dma address and
start!
then the master interface transfers data from the sd card to the memory on
the avalon bus.

it looked like a simple example to me :)

Antti

Article: 106386
Subject: Re: Gaisler on a Spartan 3E Starter Kit?
From: "Antti" <Antti.Lukats@xilant.com>
Date: 12 Aug 2006 11:15:45 -0700

David M. Palmer schrieb:

> In article <110820062235029049%dmpalmer@email.com>, David M. Palmer
> <dmpalmer@email.com> wrote:
>
> > Gaisler has a nice suite of GPL'd IP for an AMBA-bussed Leon3 (SPARC)
> > system with Ethernet, DDR RAM, Spacewire, PCI, AES Crypto, and others.
> >  http://www.gaisler.com
>
> Following up, I sent this question to Gaisler, and he said:
>
> > The board uses a XC3S500 FPGA which has about 10,000 cells.
> > You will be able to fit a minimum leon3 system, but not
> > much more. One problem is that the board uses 16-bit DDR
> > memory, but the leon3/grlib DDR controller can only handle
> > 32-bit memory banks.
> >
> > Jiri.
>
> So I may have to stick with OpenCores and/or whatever useful components
> I can extract from the Gaisler cores.
>
> --
> David M. Palmer  dmpalmer@email.com (formerly @clark.net, @ematic.com)

Hi David,

1) I have had LEON3 working in an S3-400 in a fairly minimal system, so
you should be able to get something working as well.

2) don't even dream of fitting an Or1K uclinux-ready system into the s3e-500

3) you can experiment with MicroBlaze uclinux on the s3e starter kit board,
see the link below. it has full details, reference designs and uclinux
images for microblaze-uclinux on the s3e starter kit

http://muranaka.info/pukiwiki/pukiwiki.php?MicroBlaze%20uClinux%20and%20Spartan-3E%20Starter%20Kit

the hardware rev03 file seems to be broken though the download stops at
200kb before file end :(

Antti

Article: 106387
Subject: Re: ISE Webpack 8.1 adder wierdness
From: "Todd Fleming" <tbfleming@gmail.com>
Date: 12 Aug 2006 11:17:17 -0700

Ralf Hildebrandt wrote:
> I strongly guess that the flipflop has normal and inverted output.
> Therefore you get the inversion for free (for the cost of these flipflops).
> And furthermore it seems to be, that the pure combinational solutions
> are slightly to complex to fit into 8 LUTs (the inversion is too much to
> fit).
>
> Ralf

I don't see any reference to an inverted output on the flipflops in the
Spartan 3 data sheet.  From looking at the slice diagram in the DS and
the schematic for ADSU8 in the library guide, it looks like the LUTs
should be able to absorb the inverter on b.

Todd

Article: 106388
Subject: Re: 100 Mbit manchester coded signal in FPGA
From: John_H <johnhandwork@mail.com>
Date: Sat, 12 Aug 2006 18:19:54 GMT

rickman wrote:
> First let me say that I am not trying to be rude in any way.  If you
> read my posts and see something that you find offensive, I did not
> intend that.  My comment below about reviewing Wikipedia was meant as a
> simple statement, not an insult.  So I apologize for anything that is
> perceived as offensive.  Please keep in mind that writing is very
> different from speaking.  Since tone can not be conveyed readily words
> can be interpreted very differently depending on the tone you perceive.

I appreciate that you recognize the ineffectiveness of communication and 
that you're not intending to be rude.  That helps.

> For the technical issues... The inversion is not the relevant issue.
> If you had an algorithm that would decode the stream I gave you as the
> inverted data I would have accepted that.  The problem is the timing.
> The way Manchester is decoded is to trigger a timer (it was a one shot
> back when I first worked on this problem) that will ignore any
> following transitions for approx 3/4 of a bit time.  This gives you
> +-1/4 of a bit time to allow for distortion and jitter in the signal.
> When you sample the incoming signal with a 3x clock or a 4x clock there
> are degenerate cases where the signal is sampled at the time it is
> changing which adds a full clock period to the jitter.  In both of
> these cases there is not enough margin to allow for this an you can get
> erroneous decoding.

If you choose to use a one-shot for the decoding, you are limited to a 
higher clock rate.  There is more than one way to do a decode.  The 
degenerate cases - all 1s, all 0s, repeating 0011 - can keep the data 
from *starting* a proper decode but cannot confuse the system once data 
*has* started.

> Your analysis, if I understood it correctly, produced 6 bits of data
> when there were only four.  I am also interested in the algorithm you
> used.  It would be instructive if you gave us the detail of how you
> decode the bit stream.

For your sampling challenged stream using the second half of the bit 
pair for data:

0000101111010000
|||:  :  :  :
000:  :<- first 3 bits indicates bit boundary at middle position
  000  : <- another says adjust boundary again
   001 :  <- the first pair is 0.1 or binary 1
       :  :  :Timing is now "locked"
      011 :<- the second pair is 0.1 or binary 1
         110 :<- 3rd pair 1.0 or binary 0
            100  <- 4th pair is 1.0 or binary 0

Full decode.
The first detection of a constant sequence of 3 bits will remove all 
ambiguity.  If the clock is known to be in the range (2x,4x) exclusive 
(please tighten the range for jitter or duty cycle) then there will be 
instances of at least 3 consecutive constant samples somewhere in the 
data stream.  The closer to 2x or the less data diversity, the longer it 
takes for initial lock, but it *will* lock on all data following that 
first "triple."

What your example missed was the occasional shortening of the full bit 
period.  I showed above bringing down three bits at a time.  The three 
bits would "slide" further by one bit if another sequence of 4 constant 
samples showed up, throwing away the extra bit and aligning a new bit 
trio for analysis.  The opposite of this slide is the need for a 
compression.  If the bit trio that's analyzed has a sequence of 010 or 
101, the bit pair is the first two bits (01. or 10. in the notation I 
used above) and the next bit trio starts with the 3rd bit in the current 
bit trio, not the bit after.

Triples (000 or 111) declare the starting point for bit trio analysis. 
If a four constant sample - a quad - is seen, it's just another triple 
one bit over that again declares a new starting point, throwing away one 
redundant bit.  Active bit trios (010 or 101) which are analyzed for the 
Manchester pair use only the first two bits for the pair and the 3rd 
bit becomes the start of the next bit trio for analysis.
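
As a rough illustration, the trio rule described above can be modelled in a
few lines of Python. This is only a behavioral sketch under stated
assumptions, not the Verilog implementation: the function name decode_trios
is hypothetical, samples are given as a string of '0'/'1' characters taken
at roughly three samples per bit period, everything before the first triple
is ignored, and the second half of each Manchester pair is used as the data
bit.

```python
from typing import List


def decode_trios(samples: str) -> List[int]:
    """Decode a '0'/'1' sample string using the trio rule (hypothetical model)."""
    bits: List[int] = []
    locked = False  # assumption: no data is produced until the first triple
    i = 0
    while i + 3 <= len(samples):
        trio = samples[i:i + 3]
        if trio in ("000", "111"):
            # Triple: the bit boundary is in the middle, so slide one sample.
            # A quad is simply two overlapping triples and slides twice.
            locked = True
            i += 1
        elif not locked:
            # Not aligned yet: keep sliding until a triple is seen.
            i += 1
        elif trio in ("010", "101"):
            # Short trio: the pair is the first two samples, and the third
            # sample becomes the start of the next trio.
            bits.append(int(trio[1]))
            i += 2
        else:
            # Normal trio: the pair is the first and last sample
            # (for example 001 is read as 0.1).
            bits.append(int(trio[2]))
            i += 3
    return bits


if __name__ == "__main__":
    print(decode_trios("0000101111010000"))  # the stream decoded above
```

Run on the stream above, this model returns the pairs 0.1, 0.1, 1.0, 1.0
as the bits 1, 1, 0, 0. Trailing samples that do not fill a whole trio are
dropped rather than guessed.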

I'm getting a simulation together to run bunches of these sequences.  If 
the Xilinx tools support simulation, I should have that done today.  If 
not, tools at work need to be employed after hours.

> Ok, I think I understand where the extra 2 bits came from.  Somehow you
> assumed that the intial and final zeros were adjacent to ones and added
> extra edges that produced data.  So we can ignore those edges and the
> other data looks good.  But what was your algorithm?  You need to have
> a method that can be implemented in logic.  I am pretty confident that
> no matter what algorithm you choose, I can find a case where it won't
> work.

The first bit came from backing up the extraction in a sense suggested 
by Brian Drummond in the embedded clocks thread - retroactive decoding - 
along with the knowledge that the last half of the Manchester bit pair 
is 0.  Similarly the first half of the bit pair at the end of your 
sequence absolutely *starts* with a zero.  These known quantities 
weren't covered explicitly in the algorithm I've demonstrated but could 
have been.

<snip>

>> You mentioned the sequences need to be decoded to 1001 yet you decoded 1100.
>> At the 2x transmit output, the encoded sequence would be either 10100101 or
>> 01011010 depending on your polarity.  Is that what you were attempting to
>> show?  Or was it 1100?
> 
> Yes, the second bitstream produces a wrong pattern because of the
> jitter introduced.  That is my point.  You can decode the first
> bitstream because there is no distortion.  But the second bitstream
> shows that that distortion introduced by sampling on the transition
> will give errors and can not be avoided with a 3x or 4x clocking
> scheme.

The two bitstreams decoded identically in my example.  You said 1001 but 
you showed 1100.

<snip>

I'll have Verilog ready later.  It's specifically for the 2x-4x 
(exclusive) case and - like any Manchester decoder - will have a lock 
delay based on the data and sampling conditions.  This wide range means 
the rate must be known approximately, but not with any precision.  
Simple RC oscillators could be used at both ends for a 3x 
sampler and work with this algorithm.

Manchester decoding with greater than 3x allows the sampling to be split 
into distinct halves where the error for sampling of N/2 and N-wide 
pulses in an Nx sampling scheme do not overlap.  They may abut at lower 
values and higher distortion but they don't overlap, allowing simpler 
decoding schemes.

Embedded clocks work.

- John_H

Article: 106389
Subject: Re: JOP as SOPC component
From: "Martin Schoeberl" <mschoebe@mail.tuwien.ac.at>
Date: Sat, 12 Aug 2006 20:47:33 +0200

Hi KJ,

> get a feel for it, it is very easy to see if a slave device requires wait states (and if it does, is it a fixed number or 
> controllable by the slave) and whether the slave device has any read latency (and if it does, it is a

Yes, but e.g. for an SRAM interface there are some timings in ns. And
it's not that clear how this translates to wait states.

> The template is
>
> process(Clock)
> begin
>    if rising_edge(Clock) then
>        if (Reset = '1') then
>            Read <= '0';
>            Write <= '0';
>        elsif (WaitRequest = '0') then
>            -- Put your code here for whenever it is you want to read and/or write
>            -- When writing you would also set WriteData here
>            -- For example, if you're not ready to receive data whenever the slave says it is
>            -- ready than you simply set Read <= '0' until you are ready.
>        end if;
>    end if;
> end process;
>

I disagree on this template ;-) Perhaps I'm wrong (as an Avalon newbie),
but: Why is all your active code in waitrequest='0'? According to the
Avalon specification, you have to bring out address, read, write and
writedata to start the transaction - independent of waitrequest.
waitrequest=0 just ends your transaction.

From the specification (p 47, 49) it is allowed to start a read or
write transaction independent of the status of waitrequest. Did you
run into troubles with this?

Ok, after a second thought on your code it looks like you're starting
your actions at the last cycle of the former transaction. Mmh, kind
of strange thinking.

What about this version (sc_* signals are my internal master signals)?

This case statement is the combinational next-state logic:

case state is

    when idl =>
        if sc_rd='1' then
            if av_waitrequest='0' then
                next_state <= rd;
            else
                next_state <= rdw;
            end if;
        elsif sc_wr='1' then
            if av_waitrequest='0' then
                next_state <= wr;
            else
                next_state <= wrw;
            end if;
        end if;

    when rdw =>
        if av_waitrequest='0' then
            next_state <= rd;
        end if;

    when rd =>
        next_state <= idl;

    -- here I could add the code from the idl
    -- state for back to back read and writes
...

sc_rd and sc_wr directly start setting read and write. However,
again I have to register them for keeping them set for wait
states (sc_rd and sc_wr are only valid for one cycle).
When there is a waitrequest, I'm just waiting.

Read data is registered in the state register process:

elsif rising_edge(clk) then

    state <= next_state;
    reg_rd <= '0';
...
    case next_state is

        when idl =>

        when rdw =>
            reg_rd <= '1';

        when rd =>
            reg_rd_data <= av_readdata;
...

That's my (violation) trick as an example on the Avalon read signal:

av_read <= sc_rd or reg_rd;


>> In my case e.g. the address from JOP (= top of stack) is valid
>> only for a single cycle. To avoid one more cycle latency I present
>> in the first cycle the TOS and register it. For additional wait
>> cycles a MUX switches from TOS to the address register. I know this is a slight violation of the Avalon specification.
>> There can be some glitches on the MUX switch.
>
> You might try looking at incorporating the above mentioned template and avoid the Avalon violation.  What I've also found in 
> debugging other's code

Then I get an additional cycle latency. That's what I want to avoid.

> that doesn't adhere to the above template is that there can be subtle errors that take just the right combination of events to 
> occur in order to cause an actual system error of some sort (i.e. not just the Avalon generated assert in simulation).  If you use 
> the above template, you're guaranteed to be Avalon compliant and not have this issue.

Good to hear the comments from one who struggled with Avalon.
However, I'm still not so happy with the style the bus is
specified. The first timing diagrams look more like an asynch.
SRAM timing specification with a clock drawn on top of it.
And then it goes on with slaves with fixed wait states. Why?
If you do not provide a waitrequest in a slave that needs wait
states, you can get into trouble when you specify it wrong
at component generation.

Or does the Avalon switch fabric, when registered, take this
information into account for the waitrequest of the master?
Could be for the SRAM component. Should look into the
generated VHDL code (or in a simulation)...


> In my opinion, the Avalon bus and the .PTF files to completely define component I/O interfaces is a huge improvement over 
> Wishbone.  Although

agree, that's nice.

>> For synchronous on-chip
>> peripherals this is absolute not issue. However, this signals
>> are also used for off-chip asynchronous peripherals (SRAM).
>> However, I assume that this possible switching glitches are
>> not really seen on the output pins (or at the SRAM input).
>
> Again, if you use the template, you won't have the gliching even if the signals go off chip to a device.

Again, one more cycle latency ;-)

Martin

Article: 106390
Subject: Re: Embedded clocks
From: Jim Granville <no.spam@designtools.maps.co.nz>
Date: Sun, 13 Aug 2006 06:55:42 +1200

rickman wrote:
<snip>
> 
> If I am going to require a time reference at the receiver the simplest
> scheme I know of is just async serial data with a start and a stop bit.

This is not quite the simplest.

It imposes clock tolerance requirements, and is half duplex, so the
transmitter has to generate its own clock.

If you want to ease that, you can do something like the LIN bus, which
gives an auto-baud preamble, but that is getting complex for CPLDs.

>  No point in using Manchester encoding if I am transferring the data
> over a wire just a few inches long.

Many TV remotes use Manchester, and they do that to allow the use of RC 
clocks and operation straight from the battery.

If you want the simplest scheme, in a CPLD, use one-wire, because that
is duplex, and does not need to generate a TX clock, just a Tx time slot 
( which can be monostable derived ).

If you can get up to 2 wires, then i2c & variants are a widely used 
standard, and it does not take too much CPLD resource.

There is something of a flurry of PowerControl busses being released at 
the moment, some are one wire, and some are 2 wire.
Generally, they try to be faster, and more low-voltage tolerant, than i2c.

-jg

Article: 106391
Subject: Re: JOP as SOPC component
From: "Martin Schoeberl" <mschoebe@mail.tuwien.ac.at>
Date: Sat, 12 Aug 2006 21:15:59 +0200

>> Another point is, in my opinion, the wrong role who has to hold data
>> for more than one cycle. This is true for several busses (e.g. also
>> Wishbone). For these busses the master has to hold address and write
>> data till the slave is ready. This is a result from the backplane
>> bus thinking. In an SoC the slave can easily register those signals
>> when needed longer and the master can continue.
>
> What's you're describing is not an Avalon issue or a result of 'backplane
> bus thinking', and is not a limitation of Avalon.  If it exists in your
> design than it's a limitation of the slave component design.  The slave

OK, but what if I'm not writing the slave? At the moment I'm thinking
about the master side.

> generates the wait request output which is used to tell the master that it
> needs to hold the address and data for it because it essentially doesn't
> have any space left to hold it itself.  If the slave component design has
> provisions to register and hold the address and data than it can do this

You could force the slave designers to register the address and data
if needed with a different specification - such as SimpCon ;-)
Or you could allow non-registering slaves, but register the signals
in the Avalon switch fabric for those slaves that do not
register the address and data themselves.

However, this is not only an issue with Avalon. It is the
same with Wishbone, OPB, AMBA, and OCP. So, perhaps
my idea is wrong ;-)


> leave the wait request output not asserted and the cycle completes.  If you
> think about it, this would simply be a one deep fifo for holding the
> address/data/command.  If you generalize a bit more you would see that the
> fifo wouldn't need to be restricted to being only one deep and could be any
> depth.  So as the master device performs reads and writes these commands
> would be written into the fifo without asserting wait request but also
> remember that any fifo can fill up at which point the slave must assert wait
> request because it has no more room to store anything which means that the
> master device has to hold on to it for a bit.

That idea is incorporated in a similar way in the SimpCon spec. See at:
http://www.opencores.org/cvsweb.cgi/~checkout~/simpcon/doc/simpcon.pdf

page 7, Figure 4. Perhaps it could be drawn a little bit
clearer.

>
>> On the other hand,
>> as JOP continues to execute and it is not so clear when the result
>> is read, the slave should hold the data when available. That is easy
>> to implement, but Wishbone and Avalon specify just a single cycle
>> data valid.
>>
> What you would need then is a signal generated by the master back to the
> slave to say that the master isn't ready to receive the data and would then
> cause the slave to hold on to the read data.  But if you think about it a
> bit more, the only reason that the slave is providing read data in the first
> place is because the master device requested it in the first place.  If the
> master wasn't ready to receive data it should simply not assert the read
> signal command output.

Why not? What about issuing a read command and then just continuing
with other instructions to hide the latency? Isn't this also the
idea of prefetching in newer processors?

> By the way, Avalon has a leg up on Wishbone in regards to a cleaner logical
> approach to handling wait states and latency.  Avalon treats the address

Agree, with Wishbone you can not issue overlapping transactions.

Martin

Article: 106392
Subject: Re: Dio5 interface with ps2 port
From: "Phil" <dont.smoke@gmail.com>
Date: 12 Aug 2006 12:28:19 -0700


radarman wrote:
> Phil wrote:
> > Hi, I am trying to interface a keyboard with the Xilinx Dio5 board ps2
> > port using EDK(c dev kit).   From my understanding, the keyboard sends
> > out a low for 50ms(or whatever time) before it sends the scan code.  I
> > am confused on how i poll for this 50ms time.  Do i put it in my main{}
> > code to poll everytime for this scan code or does the keyboard have
> > some sort of interrupt?  In addition, do i have to map every single key
> > manually or is there another way? Thanks!
>
> I would suggest writing a PS/2 interface, or adapting an existing one.
> You really don't want software manually handling a slow peripheral like
> a keyboard or mouse. I have a PS/2 UART core that just handles
> recieving and transmitting - you have to implement scancode conversion
> and keyboard/mouse state in higher level logic or software.
>
> Note, you can use a ROM to handle the scancode conversion to ASCII, if
> all you need is text. Simply pass your scancode into the ROM as the
> address, and take the output of the ROM as your data. I would suggest,
> at a minimum, keeping track of the shift key. Use a state-bit as an
> additional input to your ROM, so you can provide both shifted, and
> non-shifted, characters.

I am not sure what you mean by the PS/2 UART core.  Do I have to write
this core myself or can I get it from somewhere?  My knowledge of this
stuff is extremely limited.  In addition, it was suggested that I use 2
GPIOs to take in the clock and the keyboard data signal.  If I tried to
implement it this way, do you have any idea what I should do?

Article: 106393
Subject: Re: 100 Mbit manchester coded signal in FPGA
From: "rickman" <spamgoeshere4@yahoo.com>
Date: 12 Aug 2006 12:39:15 -0700

I can't say I understand your algorithm exactly, but try it on this
example
00001100110010000

Can you describe your decoding in a way that can be implemented in
logic?  Even if it is a lookup table, it should be definable in logical
terms.


John_H wrote:
> For your sampling challenged stream using the second half of the bit
> pair for data:
>
> 0000101111010000
> |||:  :  :  :
> 000:  :<- first 3 bits indicates bit boundary at middle position
>   000  : <- another says adjust boundary again
>    001 :  <- the first pair is 0.1 or binary 1
>        :  :  :Timing is now "locked"
>       011 :<- the second pair is 0.1 or binary 1
>          110 :<- 3rd pair 1.0 or binary 0
>             100  <- 4th pair is 1.0 or binary 0

Article: 106394
Subject: Re: JOP as SOPC component
From: "KJ" <kkjennings@sbcglobal.net>
Date: Sat, 12 Aug 2006 19:44:20 GMT


"Martin Schoeberl" <mschoebe@mail.tuwien.ac.at> wrote in message 
news:44de2247$0$28520$3b214f66@tunews.univie.ac.at...
>
> Yes, but e.g. for an SRAM interface there are some timings in ns. And
> it's not that clear how this translates to wait states.

Since Avalon is not directly compatible with typical SRAMs, this implies that 
you need to have an Avalon compatible component that translates Avalon into 
the particular SRAM that you're interested in.  In other words, you need an 
Avalon SRAM Controller component.  Once you have this component, you would 
just plop it down in SOPC Builder just like you would a DDR Controller, a 
PCI interface, or any other SOPC component.

Assuming for the moment that you wanted to write the code for such a 
component, one would likely define the component to have the following:
- A set of Avalon bus signals
- SRAM Signals that are defined as Avalon 'external' (i.e. they will get 
exported to the top level) so that they can be brought out of the FPGA.
- Generic parameters so that the actual design code does not need to hard 
code any of the specific SRAM timing requirements.

Given that, the VHDL code inside the SRAM controller would set its Avalon 
side wait request high as appropriate while it physically performs the 
read/write to the external SRAM.  The number of wait states would be roughly 
equal to the SRAM cycle time divided by the Avalon clock cycle time.
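
As a quick back-of-the-envelope check of that ratio, here is a small Python
sketch. The function name and the example timing numbers are made up for
illustration and do not come from any particular SRAM data sheet; the result
is rounded up to whole clock cycles, and whether the first cycle counts as a
wait state depends on how the controller is written.

```python
import math


def sram_access_cycles(sram_cycle_ns: float, clk_period_ns: float) -> int:
    """Whole Avalon clock cycles needed to cover one SRAM cycle (rounded up)."""
    if sram_cycle_ns <= 0 or clk_period_ns <= 0:
        raise ValueError("timing values must be positive")
    return math.ceil(sram_cycle_ns / clk_period_ns)


if __name__ == "__main__":
    # Hypothetical example: a 15 ns SRAM on a 100 MHz (10 ns) Avalon clock.
    print(sram_access_cycles(15, 10))
```

In a real controller these values would come in through the generic
parameters mentioned above rather than being hard coded.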

Although maybe it sounds like a lot of work and you may think it results in 
some sort of 'inefficient bloat' it really isn't.  Any synthesizer will 
quickly reduce the logic to what is needed based on the usage of the design. 
What you get in exchange is very portable and reusable components.

>
>> The template is
>>
>> process(Clock)
>> begin
>>    if rising_edge(Clock) then
>>        if (Reset = '1') then
>>            Read <= '0';
>>            Write <= '0';
>>        elsif (WaitRequest = '0') then
>>            -- Put your code here for whenever it is you want to read 
>> and/or write
>>            -- When writing you would also set WriteData here
>>            -- For example, if you're not ready to receive data whenever 
>> the slave says it is
>>            -- ready than you simply set Read <= '0' until you are ready.
>>        end if;
>>    end if;
>> end process;
>>
>
> I disagree on this template ;-) Perhaps, I'm wrong (as an Avalon newbie),
> but: Why is all your active code in waitrequest='0'? From the
> Avalon specification. You have to bring out address, read, write and
> writedata to start the transaction - independent of waitrequest.
> waitrequest=0 just ends your transaction.

Not true.  The Avalon bus specification requires the master to hold (i.e. not 
change) Address, WriteData, Read and Write if WaitRequest is '1'.  Given 
that, the 'elsif' in the template ensures that the inner code only gets 
executed when WaitRequest = '0'.

>
> From the specification (p 47, 49) it is allowed to start a read or
> write transaction independent of the status of waitrequest. Did you
> run into troubles with this?
>
That's true, you can 'start' a read/write transaction independent of wait 
request, the thing is that you can't end it or allow any of the outputs to 
change if waitrequest is active.
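
To make the hold rule concrete, here is a small Python sketch of a trace
checker. It is only an illustration of the rule as stated here, not part of
any Avalon tool: the Cycle record and the hold_violations function are
hypothetical names, and the check assumes that the rule applies only while
the master has Read or Write asserted, so that starting a new transfer from
idle is always allowed.

```python
from typing import List, NamedTuple


class Cycle(NamedTuple):
    """Master outputs plus the sampled waitrequest for one clock cycle."""
    address: int
    read: int
    write: int
    writedata: int
    waitrequest: int


def hold_violations(trace: List[Cycle]) -> List[int]:
    """Return the cycle indices where a stalled master changed its outputs."""
    bad: List[int] = []
    for n in range(1, len(trace)):
        prev, cur = trace[n - 1], trace[n]
        stalled = prev.waitrequest == 1 and (prev.read == 1 or prev.write == 1)
        # Compare address, read, write and writedata (the first four fields).
        if stalled and prev[:4] != cur[:4]:
            bad.append(n)
    return bad


if __name__ == "__main__":
    ok = [Cycle(0x10, 1, 0, 0, 1), Cycle(0x10, 1, 0, 0, 0), Cycle(0, 0, 0, 0, 0)]
    print(hold_violations(ok))
```

A master written to the template cannot fail this check, because none of
its outputs are assigned while WaitRequest is '1'.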

> Ok, after a second thought on your code it looks like you're starting
> your actions at the last cycle of the former transaction. Mmh, kind
> of strange thinking.

Not really, it is just simpler to say that I'm not going to go anywhere near 
code that can potentially change any of the outputs if wait request is 
active.  As an example, take a look at your code below where you've had to 
sprinkle "if av_waitrequest = '0'" throughout the code to make sure you 
don't change states at the 'wrong' time (i.e. when av_waitrequest is 
active).  Problems can come up when you miss one of those "if 
av_waitrequest = '0'" statements.  Depending on exactly where you missed 
putting it in, it can be a rather subtle problem to debug.

Now consider if you had simply put the "if av_waitrequest = '0'" statement 
around your entire case statement (with it understood that outside that 
you would still have the obligatory 'if reset go to idle').  Now it 
is much easier to see that your entire state machine will not change states 
on you at the wrong time...less code, and code that is more easily 
inspected for correctness.  I've also seen it reduce the number of states 
required, which simplifies the code even more.
>
> What about this version (sc_* signals are my internal master signals)
>
> that case is the next state logic and combinatorial:
>
> case state is
>
>    when idl =>
>        if sc_rd='1' then
>            if av_waitrequest='0' then
>                next_state <= rd;
>            else
>                next_state <= rdw;
>            end if;
>        elsif sc_wr='1' then
>            if av_waitrequest='0' then
>                next_state <= wr;
>            else
>                next_state <= wrw;
>            end if;
>        end if;
>
>    when rdw =>
>        if av_waitrequest='0' then
>            next_state <= rd;
>        end if;
>

>    when rd =>
>        next_state <= idl;
--- Are you sure you always want to go to idl?  This would probably cause an 
error if the avalon outputs were active in this state.
>
>    -- here I could add the code from the idl
>    -- state for back to back read and writes
> ...
>
> sc_rd and sc_wr directly start setting read and write. However,
> again I have to register them for keeping them set for wait
> states (sc_rd and sc_wr are only valid for one cycle).
> When there is a waitrequest, I'm just waiting.
>
> Read data is registered in the state register process:
>
> elsif rising_edge(clk) then
>
>    state <= next_state;
>    reg_rd <= '0';
> ...
>    case next_state is
>
>        when idl =>
>
>        when rdw =>
>            reg_rd <= '1';
>
>        when rd =>
>            reg_rd_data <= av_readdata;
> ...
>
> That's my (violation) trick as an example on the Avalon read signal:
>
> av_read <= sc_rd or reg_rd;

Whether it works or not for you would take more analysis, I'll just say that 
every time I've run across code that wasn't working for 'some reason' and I 
managed to trace it back to a mishandled Avalon data transfer and the Avalon 
master code did not match my template I was able to fix it by recoding it 
according to my template.  I'm sure there are other ways that 'can' work, 
but I think that my template is the simplest and easiest to verify by 
inspection.

>
>
>>> In my case e.g. the address from JOP (= top of stack) is valid
>>> only for a single cycle. To avoid one more cycle latency I present
>>> in the first cycle the TOS and register it. For additional wait
>>> cycles a MUX switches from TOS to the address register. I know this is a 
>>> slight violation of the Avalon specification.
>>> There can be some glitches on the MUX switch.
>>
>> You might try looking at incorporating the above mentioned template and 
>> avoid the Avalon violation.  What I've also found in debugging other's 
>> code
>
> Then I get an additional cycle latency. That's what I want to avoid.

Not on the Avalon bus, maybe for getting stuff into the template but even 
that is a handshake.  I've even used Avalon within components to transfer 
data between rather complicated processes just because it is a clean data 
transfer interface and still have no problem transferring data on every 
clock cycle when it is available.  I'm not familiar enough with your code, 
but I suspect that it can be done in your case as well.

>
>> that doesn't adhere to the above template is that there can be subtle 
>> errors that take just the right combination of events to occur in order 
>> to cause an actual system error of some sort (i.e. not just the Avalon 
>> generated assert in simulation).  If you use the above template, you're 
>> guaranteed to be Avalon compliant and not have this issue.
>
> Good to hear the comments from one who struggled with Avalon.
> However, I'm still not so happy with the style the bus is
> specified. The first timing diagrams look more like an asynch.
> SRAM timing specification with a clock drawn on top of it.
> And then it goes on with slaves with fixed wait states. Why?
> If do not provide a waitrequest in a slave that needs wait
> states you can get into troubles when you specify it wrong
> at component genration.

No, PTF files let you state that there are a fixed number of wait states and 
not have an explicit waitrequest on the slave.

>
> Or does the Avalon switch fabric, when registered, take this
> information into account for the waitrequest of the master?

It does.

> Could be for the SRAM component. Should look into the
> generated VHDL code (or in a simulation)...
>
I'd suggest looking at the system.ptf file for your design.
>
>> In my opinion, the Avalon bus and the .PTF files to completely define 
>> component I/O interfaces is a huge improvement over Wishbone.  Although
>
> agree, that's nice.
>
>>> For synchronous on-chip
>>> peripherals this is absolute not issue. However, this signals
>>> are also used for off-chip asynchronous peripherals (SRAM).
>>> However, I assume that this possible switching glitches are
>>> not really seen on the output pins (or at the SRAM input).
>>
>> Again, if you use the template, you won't have the gliching even if the 
>> signals go off chip to a device.
>
> Again, one more cycle latency ;-)
Again, nope not if done correctly.

KJ

Article: 106395
Subject: Re: 100 Mbit manchester coded signal in FPGA
From: John_H <johnhandwork@mail.com>
Date: Sat, 12 Aug 2006 20:07:53 GMT

rickman wrote:
> I can't say I understand your algorithm exactly, but try it on this
> example
> 00001100110010000
   000 realign
    000 realign
     001 Manchester pair 0.1
        100 Manchester pair 1.0
           110 Manchester pair 1.0
              010 Manchester pair 01.
                000 realign
                 000 realign
     ------------ 001 (assumed) Manchester pair 0.1
      1  0  0  1   1
     ------------

> Can you describe your decoding in a way that can be implemented in
> logic.  Even if it is a lookup table, it should be definable in logical
> terms.

------- The rest of the post is just code -----------------------
Before simulation:

module
   Manchester
   ( input            clk
   , input            reset
   , input            datIn
   , output reg [1:0] ManchesterPair
   , output reg       usePair
   );

reg       r_reset = 1'b1;
reg       startup;
reg [4:0] rcv;
reg       short;
reg       long;
reg [1:0] bitStart;

always @(posedge clk)
begin
   r_reset <= reset;
   if( r_reset )
   begin
     startup        <= 1'b0;
     rcv            <= 5'b01000;
     ManchesterPair <= 2'h0;
     bitStart       <= 2'h0;
     usePair        <= 1'b0;
   end
   else
   begin
     if( ~startup )
     begin
       startup <= rcv[1];  // First 1 starts off valid receive
       rcv <= {rcv[1] ? 3'b101 : 3'b010, rcv[0], datIn};
     end
     else
       rcv <= {rcv[3:0],datIn};

     short <= rcv[3:1]==3'b010 | rcv[3:1]==3'b101;
     long  <= rcv[3:1]==3'b000 | rcv[3:1]==3'b111;

     if( bitStart== 2'h3 )
     begin
       ManchesterPair <= short ? {rcv[4],rcv[3]}
                               : {rcv[4],rcv[2]};
       bitStart <= short ? 2'h2 : 2'h1;
     end
     else                       // Either 3 or 4 samples
       bitStart <= long ? 2'h2  // start the valid data
                 : bitStart + (bitStart>2'h0);

     usePair <= bitStart==2'h3;
   end
end

endmodule
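
Since the module above is marked "before simulation", a software model can help when stepping through sample sequences by hand. The following is a minimal Python sketch that mirrors the registers of the Verilog one for one. Assumptions not in the original: the model starts in the state loaded by the reset branch (the r_reset pipeline stage is omitted), `short` and `long` start at 0 (the Verilog leaves them undefined out of reset), and the names `ManchesterModel`, `clock` and `decode` are made up for illustration.

```python
from dataclasses import dataclass


def bit(value, index):
    """Return bit `index` of the integer `value` (0 or 1)."""
    return (value >> index) & 1


@dataclass
class ManchesterModel:
    # Values loaded by the synchronous reset branch of the Verilog.
    startup: int = 0
    rcv: int = 0b01000
    manchester_pair: int = 0
    bit_start: int = 0
    use_pair: int = 0
    # Assumption: the Verilog never resets `short` and `long`; 0 is chosen here.
    short: int = 0
    long: int = 0

    def clock(self, dat_in):
        """Apply one rising clock edge with sample dat_in (0 or 1)."""
        rcv = self.rcv
        window = (rcv >> 1) & 0b111  # rcv[3:1]

        # Every next-state value is computed from the current state first,
        # which mirrors non-blocking (<=) assignment.
        if not self.startup:
            next_startup = bit(rcv, 1)
            head = 0b101 if bit(rcv, 1) else 0b010
            next_rcv = (head << 2) | (bit(rcv, 0) << 1) | dat_in
        else:
            next_startup = self.startup
            next_rcv = ((rcv & 0b1111) << 1) | dat_in

        next_short = int(window in (0b010, 0b101))
        next_long = int(window in (0b000, 0b111))

        next_pair = self.manchester_pair
        if self.bit_start == 3:
            second = bit(rcv, 3) if self.short else bit(rcv, 2)
            next_pair = (bit(rcv, 4) << 1) | second
            next_bit_start = 2 if self.short else 1
        else:
            step = int(self.bit_start > 0)
            next_bit_start = 2 if self.long else self.bit_start + step

        next_use_pair = int(self.bit_start == 3)

        # Commit all registers at once.
        self.startup, self.rcv = next_startup, next_rcv
        self.short, self.long = next_short, next_long
        self.manchester_pair, self.bit_start = next_pair, next_bit_start
        self.use_pair = next_use_pair
        return self.use_pair, self.manchester_pair


def decode(samples):
    """Feed a string of '0'/'1' samples; return the pairs flagged by use_pair."""
    model = ManchesterModel()
    pairs = []
    for ch in samples:
        use_pair, pair = model.clock(int(ch))
        if use_pair:
            pairs.append(format(pair, "02b"))
    return pairs
```

Calling `decode()` on a sample string such as the one quoted at the top of this post lists the pairs the logic would flag; it is only as trustworthy as the assumptions listed above.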

Article: 106396
Subject: virtex II inner organisation
From: flo <tnerolf@freesurf.fr>
Date: Sat, 12 Aug 2006 22:21:00 +0200

Hi everyone,
I'm trying to implement readback and scrubbing on an XC2V1500 FPGA.

I've got a problem identifying the Major Address and the Minor Address
when I'm doing a readback.
I have read the documents (XAPP138 and XAPP151), but nothing in them works
with Virtex-II.
I know the frame length and the number of frames because they are in the
bitstream, but I know nothing about the number of frames in each minor
address depending on the major address and the block type...

Does anyone know how to determine it?

Thanks a lot.

florent

Article: 106397
Subject: Re: JOP as SOPC component
From: "Martin Schoeberl" <mschoebe@mail.tuwien.ac.at>
Date: Sat, 12 Aug 2006 22:30:04 +0200

That's almost like chatting: a high-speed newsgroup
discussion ;-)

To keep up with your speed I have to split the answers
by subtopic. This one is about the Avalon
SRAM interface.

>> Yes, but e.g. for an SRAM interface there are some timings in ns. And
>> it's not that clear how this translates to wait states.
>
> Since Avalon is not directly compatible the typical SRAMs, this implies that

Again, I disagree ;-) The Avalon specification also covers asynchronous
peripherals. That adds a little to the complexity of the
specification.

> Assuming for the moment, that you wanted to write the code for such a
> component, one would likely define that the component to have the
> following:
> - A set of Avalon bus signals
> - SRAM Signals that are defined as Avalon 'external' (i.e. they will get
>   exported to the top level) so that they can be brought out of the FPGA.
> - Generic parameters so that the actual design code does not need to hard
>   code any of the specific SRAM timing requirements.

Yes, that's the way it is described in the Quartus manual. I did my
SRAM interface in this way. Here is a part of the .ptf that describes
the timing of the external SRAM:

      SLAVE sram_tristate_slave
      {
         SYSTEM_BUILDER_INFO
         {
            ....
            Setup_Time = "0ns";
            Hold_Time = "2ns";
            Read_Wait_States = "18ns";
            Write_Wait_States = "10ns";
            Read_Latency = "0";
            ....

> Given that, the VHDL code inside the SRAM controller would set it's Avalon
> side wait request high as appropriate while it physically performs the

There is no VHDL code associated with this SRAM. All is done by the
SOPC builder.

> read/write to the external SRAM.  The number of wait states would be
> roughly equal to the SRAM cycle time divided by the Avalon clock cycle
> time.

The SOPC builder will translate the timing from ns to clock cycles for
me. However, this is a kind of iterative process as the timing of the
component depends on tco and tsu of the FPGA pins of the compiled design.
Input pin th can usually be ignored as it is covered by the minimum tco
of the output pins. The same is true for the SRAM write th.
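
As a rough illustration of the ns-to-cycles translation, here is a small Python sketch that rounds a timing value up to whole clock cycles. It is only an assumption about how such a conversion could be done; how SOPC Builder actually rounds is not described here. The function name `wait_cycles`, the `margin_ns` parameter (a stand-in for pin tco/tsu) and the 100 MHz clock are hypothetical. The 18 ns and 10 ns figures come from the .ptf fragment above.

```python
def wait_cycles(time_ns, clock_hz, margin_ns=0):
    """Round a timing requirement in ns up to whole clock cycles.

    margin_ns is a hypothetical allowance for FPGA pin delays (tco, tsu)
    that would be added after the first compile.
    Integer arithmetic avoids floating-point rounding surprises.
    """
    total_ns = time_ns + margin_ns
    # Ceiling division of (total_ns * clock_hz) by 1e9 ns per second.
    return -(-(total_ns * clock_hz) // 1_000_000_000)


# Assumed 100 MHz system clock (10 ns period).
CLOCK_HZ = 100_000_000
read_cycles = wait_cycles(18, CLOCK_HZ)    # Read_Wait_States  = "18ns"
write_cycles = wait_cycles(10, CLOCK_HZ)   # Write_Wait_States = "10ns"
print(read_cycles, write_cycles)
```

With a margin added for pin delays the result can cross into the next cycle, which is why the process is iterative: the margin is only known after the design has been compiled.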

> Although maybe it sounds like a lot of work and you may think it results
> in some sort of 'inefficient bloat' it really isn't.  Any synthesizer will
> quickly reduce the logic to what is needed based on the usage of the
> design. What you get in exchange is very portable and reusable components.

No, it's really not much work. Just a few mouse clicks (no VHDL) and the
synthesized result is not big. The SRAM tristate bridge contains just
the address and control output registers. I assume the input registers
are buried somewhere in the arbitrator.

Martin

Article: 106398
Subject: Re: 100 Mbit manchester coded signal in FPGA
From: "rickman" <spamgoeshere4@yahoo.com>
Date: 12 Aug 2006 13:31:07 -0700

Ok, I thought this would fail and it did.  This sequence is the same
bit pattern as before, with one edge detected differently because of jitter.

0001101110010000
0001100110010000
______^ - this should be pointing to the second zero after the 1->0
transition

Can you fix your algorithm to deal with this case?
Do you see what I am referring to about the jitter making it impossible
to distinguish the mid-bit transition from inter-bit transitions?
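
To make the difference between the two sample strings above concrete, here is a short Python sketch that lists the run lengths of each and the positions where they differ. The variable names are made up for illustration, and it is assumed that the first line is the jittered capture and the second the reference.

```python
from itertools import groupby


def run_lengths(samples):
    """Return (level, length) for each run of identical samples."""
    return [(level, len(list(group))) for level, group in groupby(samples)]


jittered = "0001101110010000"   # first line in the post (assumed jittered)
reference = "0001100110010000"  # second line in the post

# Sample positions (0-based) where the two captures disagree.
differing = [i for i, (a, b) in enumerate(zip(jittered, reference)) if a != b]

print(differing)
print(run_lengths(reference))
print(run_lengths(jittered))
```

One sample moving across an edge turns the reference's 2,2 run pair into 1,3, so a decoder that classifies runs by length sees a short and a long run where there were two equal ones.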

John_H wrote:
> rickman wrote:
> > I can't say I understand your algorithm exactly, but try it on this
> > example
> > 00001100110010000
>    000 realign
>     000 realign
>      001 Manchester pair 0.1
>         100 Manchester pair 1.0
>            110 Manchester pair 1.0
>               010 Manchester pair 01.
>                 000 realign
>                  000 realign
>      ------------ 001 (assumed) Manchester pair 0.1
>       1  0  0  1   1
>      ------------
>
> > Can you describe your decoding in a way that can be implemented in
> > logic.  Even if it is a lookup table, it should be definable in logical
> > terms.
>
> ------- The rest of the post is just code -----------------------
> Before simulation:
>
> module
>    Manchester
>    ( input            clk
>    , input            reset
>    , input            datIn
>    , output reg [1:0] ManchesterPair
>    , output reg       usePair
>    );
>
> reg       r_reset = 1'b1;
> reg       startup;
> reg [4:0] rcv;
> reg       short;
> reg       long;
> reg [1:0] bitStart;
>
> always @(posedge clk)
> begin
>    r_reset <= reset;
>    if( r_reset )
>    begin
>      startup        <= 1'b0;
>      rcv            <= 5'b01000;
>      ManchesterPair <= 2'h0;
>      bitStart       <= 2'h0;
>      usePair        <= 1'b0;
>    end
>    else
>    begin
>      if( ~startup )
>      begin
>        startup <= rcv[1];  // First 1 starts off valid receive
>        rcv <= {rcv[1] ? 3'b101 : 3'b010, rcv[0], datIn};
>      end
>      else
>        rcv <= {rcv[3:0],datIn};
>
>      short <= rcv[3:1]==3'b010 | rcv[3:1]==3'b101;
>      long  <= rcv[3:1]==3'b000 | rcv[3:1]==3'b111;
>
>      if( bitStart== 2'h3 )
>      begin
>        ManchesterPair <= short ? {rcv[4],rcv[3]}
>                                : {rcv[4],rcv[2]};
>        bitStart <= short ? 2'h2 : 2'h1;
>      end
>      else                       // Either 3 or 4 samples
>        bitStart <= long ? 2'h2  // start the valid data
>                  : bitStart + (bitStart>2'h0);
> 
>      usePair <= bitStart==2'h3;
>    end
> end
> 
> endmodule

Article: 106399
Subject: Re: Embedded clocks
From: "rickman" <spamgoeshere4@yahoo.com>
Date: 12 Aug 2006 13:33:32 -0700

Jim Granville wrote:
> rickman wrote:
> <snip>
> >
> > If I am going to require a time reference at the receiver the simplest
> > scheme I know of is just async serial data with a start and a stop bit.
>
> This is not quite the simplest.
>
> It imposes clock tolerance requirements, and is half duplex, so the
> Transmit has to generate it's own clock.

But if I have an oscillator, I have a clock available.  That is my
point.  RS-232 has very loose requirements for a clock.  An RC may not
be good enough, but it doesn't take much.
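
As a back-of-envelope check on how loose the clock can be, here is a Python sketch of the usual idealized argument: the receiver resynchronizes on the start edge and samples each bit in the middle, so by the last bit the accumulated drift must stay under half a bit time. This ignores the receiver's own sampling granularity and edge distortion, so real budgets are tighter. The function name `max_clock_mismatch` is hypothetical.

```python
def max_clock_mismatch(bits_per_frame):
    """Idealized total Tx/Rx clock mismatch an async frame can tolerate.

    The last bit is sampled (bits_per_frame - 0.5) bit times after the
    start edge, and the sample point may drift by at most 0.5 bit time.
    """
    return 0.5 / (bits_per_frame - 0.5)


# 1 start + 8 data + 1 stop = 10 bits per frame.
print(f"{max_clock_mismatch(10):.1%}")
```

A shorter application-specific word relaxes the limit further, since the drift has fewer bit times in which to accumulate.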

> If you want to ease that, you can do something like the LIN bus, which
> gives a auto-baud pre-amble, but that is getting complex for CPLDs.

Way too complex.  I am looking at a very small package and I may be
limited to 64 logic cells.  In fact I don't know that I can make this
work in such a small part.  The problem is that one end of the link has
to be built into a cable housing where the signals are fanned out
again.  I don't need a lot of IO, but I expect it will take more than
64 logic cells.

> >  No point in using Manchester encoding if I am transferring the data
> > over a wire just a few inches long.
>
> Many TV remote's use manchester, and they do that to allow the use of RC
> clocks, and straight from battery operation.
>
> If you want the simplest scheme, in a CPLD, use one-wire, because that
> is duplex, and does not need to generate a TX clock, just a Tx time slot
> ( which can be monostable derived ).

I don't see one wire as being any simpler than a UART.  One wire is
just bit async rather than byte async.  You still need a timer to time
the bits.

> If you can get up to 2 wires, then i2c & variants are a widely used
> standard, and it does not take too much CPLD resource.

Yeah, I have thought about I2C, but it would have to run at High Speed
to work properly due to the addressing overhead.  SPI would work too,
but would use all four pins leaving us no spares.  A UART interface
could use two wires, one for transmit and one for receive.  The word
size can be application specific with dedicated bits for discrete
signals.  Most importantly, I think it will be the smallest in a CPLD.
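
For reference, the framing described above (one start bit, an application-specific number of data bits, one stop bit) can be sketched in a few lines of Python. This is only an illustration of the bit layout, not CPLD logic. The function names, the LSB-first ordering and the 4-bit example word are assumptions.

```python
def uart_frame(word, width):
    """Build one async frame: start bit (0), `width` data bits, stop bit (1)."""
    if not 0 <= word < (1 << width):
        raise ValueError("word does not fit in the given width")
    data = [(word >> i) & 1 for i in range(width)]  # LSB first (assumed)
    return [0] + data + [1]


def uart_unframe(bits, width):
    """Check start/stop bits and recover the data word from one frame."""
    if len(bits) != width + 2 or bits[0] != 0 or bits[-1] != 1:
        raise ValueError("framing error")
    return sum(b << i for i, b in enumerate(bits[1:-1]))


# Example: a 4-bit word where each bit could stand for one discrete signal.
frame = uart_frame(0b1011, 4)
print(frame)
print(uart_unframe(frame, 4))
```

The per-word overhead is fixed at two bits regardless of the word size, which is the contrast with a scheme that has to send an address ahead of every transfer.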
