Messages from 156225

Article: 156225
Subject: Re: Math is hard
From: Brian Drummond <brian3@shapes.demon.co.uk>
Date: Sat, 18 Jan 2014 22:06:22 GMT

On Fri, 17 Jan 2014 11:00:06 -0800, Rob Gaddi wrote:

> Hey y'all --
> 
> So this is one of those times that my lack of serious math chops is
> coming round to bite me, and none of my usual books is helping me out.
> I'm hoping someone has some thoughts.
> 
> I'm trying to approximate either exp(-1/(n+1)) or 4^(-1/(n+1)).  I can
> convince myself I don't care which.  n is an integer from 1-65535, and
> the result should be fixed-point fractional, probably U0.18.  The
> function output is always between 0-1, and goes up like a rocket for
> small n before leveling off to a steady cruise of >0.9 for the rest of
> the function domain.

Quadratic interpolation is easy, and gets arbitrarily close to most 
smooth curves.

Mock it up in a spreadsheet until its error is acceptably low, then 
transfer to HDL. If the quadratic coefficients end up being vanishingly 
small, then you can simplify to linear interpolation.
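
As an illustration of that mock-up step, here is a minimal Python sketch
(standing in for the spreadsheet). It assumes the target is exp(-1/(n+1)),
a U0.18 output, and a three-point quadratic through each segment's end
points and midpoint; the function names and the segment boundaries are
hypothetical choices, not something from the thread.

```python
import math

FRAC_BITS = 18  # U0.18 output format mentioned in the original question

def f(n):
    """Target function exp(-1/(n+1))."""
    return math.exp(-1.0 / (n + 1))

def quad_interp(n, n0, n1):
    """Quadratic (3-point Lagrange) interpolation of f on [n0, n1]."""
    nm = 0.5 * (n0 + n1)
    y0, ym, y1 = f(n0), f(nm), f(n1)
    t = (n - n0) / (n1 - n0)  # normalized position, 0..1
    # Lagrange basis polynomials for nodes at t = 0, 0.5, 1
    return (y0 * 2 * (t - 0.5) * (t - 1)
            - ym * 4 * t * (t - 1)
            + y1 * 2 * t * (t - 0.5))

def max_error_lsb(n0, n1):
    """Worst-case |error| over the integers in [n0, n1], in U0.18 LSBs."""
    worst = max(abs(quad_interp(n, n0, n1) - f(n)) for n in range(n0, n1 + 1))
    return worst * (1 << FRAC_BITS)

# Hypothetical segment choices, just to compare small-n and large-n behaviour
for n0, n1 in [(1, 128), (128, 256), (4096, 8192)]:
    print(f"segment {n0}..{n1}: max error {max_error_lsb(n0, n1):.2f} LSB")
```

Shrinking the segments where the printed error is too large is the
"until its error is acceptably low" loop.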

- Brian

Article: 156226
Subject: Re: embedded RAM vs. registers
From: alb <alessandro.basili@cern.ch>
Date: Sat, 18 Jan 2014 23:23:10 +0100

Hi Gabor,

On 1/17/2014 10:53 PM, GaborSzakacs wrote:
[]
>> I'm trying to optimize the footprint of my firmware on the target device
>> and I realize there are a lot of parameters which might be stored in the
>> embedded RAM instead of dedicated registers.
[]
> It depends on the device you're targeting.  To some extent the tools
> can make use of embedded RAM without changing your RTL.  For example
> Xilinx tools allow you to place logic into unused BRAMs, and will
> automatically infer SRL's where the design allows it.

Uhm, apparently the Microsemi devices I'm using (IGLOO), together with
the toolset (Libero IDE), are not smart enough to take advantage of the
local memory, unless I'm inadvertently asking *not* to use it. To be honest I
have not searched deeply for ram usage on these devices, but the
handbook does not provide any clue on 'use of RAM without changing RTL'.

> I've often used BRAM as a "shadow memory" to keep a copy of internal
> configuration registers for readback.  That can eliminate a large
> mux, at least for all register bits that only change when written.
> Read-only bits and self-resetting bits would still need a mux, but
> the overall logic could be reduced vs. a complete mux for all bits.

I guess I do not completely follow you here, which mux are you talking
about?

Al

p.s.: you are entitled to have your own opinion about Usenet and its
users' opinion, no more than I am.

Article: 156227
Subject: Re: embedded RAM vs. registers
From: Gabor <gabor@szakacs.org>
Date: Sat, 18 Jan 2014 22:30:24 -0500

On 1/18/2014 5:23 PM, alb wrote:
> Hi Gabor,
>
> On 1/17/2014 10:53 PM, GaborSzakacs wrote:
> []
>>> I'm trying to optimize the footprint of my firmware on the target device
>>> and I realize there are a lot of parameters which might be stored in the
>>> embedded RAM instead of dedicated registers.
> []
>> It depends on the device you're targeting.  To some extent the tools
>> can make use of embedded RAM without changing your RTL.  For example
>> Xilinx tools allow you to place logic into unused BRAMs, and will
>> automatically infer SRL's where the design allows it.
>
> Uhm, apparently the Microsemi devices I'm using (IGLOO), together with
> the toolset (Libero IDE) are not that smart to profit of the local
> memory, unless I'm inadvertently asking *not* to use it. To be honest I
> have not searched deeply for ram usage on these devices, but the
> handbook does not provide any clue on 'use of RAM without changing RTL'.
>
>> I've often used BRAM as a "shadow memory" to keep a copy of internal
>> configuration registers for readback.  That can eliminate a large
>> mux, at least for all register bits that only change when written.
>> Read-only bits and self-resetting bits would still need a mux, but
>> the overall logic could be reduced vs. a complete mux for all bits.
>
> I guess I do not completely follow you here, which mux are you talking
> about?
>

In a system with a processor (external or embedded) you typically have
some form of bus to read and write registers within the FPGA.  Normally
you need the outputs of these registers all the time, so you can't just
implement the whole thing as RAM.  Now if the CPU wants to be able to
read back the values it wrote, you need a big readback multiplexer
(unless your IGLOO has internal tristate buffers) to select the register
you want to read back.  What I do is to have a RAM that keeps a copy of
what was written by the CPU.  Then the readback mux defaults to the
output of this (simple single-port) RAM unless the register is read-only
or has some side-effects that could change the register's value when
it's not being written by the CPU.  If you have a design with a whole
lot of registers, you can really reduce the size of the readback mux.
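
A behavioural sketch of that arrangement, in Python rather than HDL. The
class and method names are hypothetical, and the model ignores clocking
and bus timing; it only shows which reads come from the shadow RAM and
which still need the mux.

```python
class RegisterBlock:
    """Flip-flop registers plus a shadow RAM used only for readback."""

    def __init__(self, depth, live_addrs=()):
        self.regs = [0] * depth      # real registers driving the logic
        self.shadow = [0] * depth    # single-port RAM copy of CPU writes
        self.live = set(live_addrs)  # read-only or self-modifying registers

    def cpu_write(self, addr, value):
        self.regs[addr] = value
        self.shadow[addr] = value    # the same write strobe updates the RAM

    def hw_update(self, addr, value):
        """Logic-side change, e.g. a status or self-resetting bit."""
        if addr not in self.live:
            raise ValueError("only 'live' registers may change without a CPU write")
        self.regs[addr] = value

    def cpu_read(self, addr):
        # The mux only spans the 'live' registers; all others read the RAM.
        if addr in self.live:
            return self.regs[addr]
        return self.shadow[addr]


block = RegisterBlock(depth=16, live_addrs={3})
block.cpu_write(5, 0xAB)
block.hw_update(3, 7)
print(block.cpu_read(5), block.cpu_read(3))
```

The readback mux shrinks from one input per register to one input per
"live" register plus the RAM output.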

Of course you could save even more logic by not having readback for
values that only change when written by the CPU.  These become
"write-only" registers, and the software guy then needs to keep his
own "shadow" copy of the values he wrote if he needs to read it
back later.

> Al
>
> p.s.: you are entitled to have your own opinion about Usenet and its
> users' opinion, no more than I am.
>

Someone said, "Opinions are like a**holes.  Everyone has one, and they
all stink."  In any case I see you removed your signature from the
latest post. ;-)

-- 
Gabor

Article: 156228
Subject: Re: my first microZed board
From: mroberds@att.net
Date: Sun, 19 Jan 2014 06:56:13 +0000 (UTC)

In sci.electronics.design John Larkin <jlarkin@highlandtechnology.com> wrote:
> https://dl.dropboxusercontent.com/u/53724080/PCBs/ASP_SN1_top.jpg

I think I recognize S2 and S3 for some reason.  :)

> Rather than being cautious, I just plugged 24 volts into it, and the
> Zed lit up and ran Linux.

Does it have a shell and Tetris installed?

Matt Roberds

Article: 156229
Subject: Re: my first microZed board
From: John Larkin <jjlarkin@highNOTlandTHIStechnologyPART.com>
Date: Sun, 19 Jan 2014 12:56:37 -0800

On Sun, 19 Jan 2014 13:26:34 -0800, josephkk <joseph_barrett@sbcglobal.net>
wrote:

>On Fri, 17 Jan 2014 13:19:31 -0800, John Larkin
><jlarkin@highlandtechnology.com> wrote:
>
>>
>>
>>Just got this from production:
>>
>>https://dl.dropboxusercontent.com/u/53724080/PCBs/ASP_SN1_top.jpg
>
>This is a much better board picture than the ones i had complained about.

Which did you complain about?

>Did you find someone in your company to do them?

Yes. Me.

I did crank down the resolution for public presentation. The original is much
better.

I have an open-top light tent in the back of the building, next to the
north-facing windows. And a few boxes that work nicely as camera supports. The
north light is a little blue, but Irfanview fixes that.

Works well enough. Show us some pics of your boards.

-- 

John Larkin                  Highland Technology Inc
www.highlandtechnology.com   jlarkin at highlandtechnology dot com   

Precision electronic instrumentation

Article: 156230
Subject: Re: my first microZed board
From: josephkk <joseph_barrett@sbcglobal.net>
Date: Sun, 19 Jan 2014 13:26:34 -0800

On Fri, 17 Jan 2014 13:19:31 -0800, John Larkin
<jlarkin@highlandtechnology.com> wrote:

>
>
>Just got this from production:
>
>https://dl.dropboxusercontent.com/u/53724080/PCBs/ASP_SN1_top.jpg

This is a much better board picture than the ones i had complained about.
Did you find someone in your company to do them?
>
>from previously posted layout...
>
>https://dl.dropboxusercontent.com/u/53724080/PCBs/P344_15.jpg
>
>
>This is a pretty serious signal processor application, but dropping
>the Zed on there makes it easy. We can plug a USB logic analyzer
>directly onto that Mictor connector, which has 16 signals and a clock
>from the uZed.
>
>Rather than being cautious, I just plugged 24 volts into it, and the
>Zed lit up and ran Linux.

Article: 156231
Subject: Re: my first microZed board
From: josephkk <joseph_barrett@sbcglobal.net>
Date: Sun, 19 Jan 2014 15:07:16 -0800

On Sun, 19 Jan 2014 12:56:37 -0800, John Larkin
<jjlarkin@highNOTlandTHIStechnologyPART.com> wrote:

>On Sun, 19 Jan 2014 13:26:34 -0800, josephkk <joseph_barrett@sbcglobal.net>
>wrote:
>
>>On Fri, 17 Jan 2014 13:19:31 -0800, John Larkin
>><jlarkin@highlandtechnology.com> wrote:
>>
>>>
>>>
>>>Just got this from production:
>>>
>>>https://dl.dropboxusercontent.com/u/53724080/PCBs/ASP_SN1_top.jpg
>>
>>This is a much better board picture than the ones i had complained about.
>
>Which did you complain about?
>
>>Did you find someone in your company to do them?
>
>Yes. Me.
>
>I did crank down the resolution for public presentation. The original is much
>better.
>
>I have an open-top light tent in the back of the building, next to the
>north-facing windows. And a few boxes that work nicely as camera supports. The
>north light is a little blue, but Irfanview fixes that.
>
>Works well enough. Show us some pics of your boards.

Sure, just as soon as i make one.  I am still looking for the one i did
over 40 years ago as a teen.  (Just when i get a bit bored though, doesn't
happen much.)

?-/

Article: 156232
Subject: Re: my first microZed board
From: Don Kuenz <garbage@crcomp.net>
Date: Sun, 19 Jan 2014 23:30:25 +0000 (UTC)

In sci.electronics.design josephkk <joseph_barrett@sbcglobal.net> wrote:
> On Sun, 19 Jan 2014 12:56:37 -0800, John Larkin
> <jjlarkin@highNOTlandTHIStechnologyPART.com> wrote:
>
>>On Sun, 19 Jan 2014 13:26:34 -0800, josephkk <joseph_barrett@sbcglobal.net>
>>wrote:
>>
>>>On Fri, 17 Jan 2014 13:19:31 -0800, John Larkin
>>><jlarkin@highlandtechnology.com> wrote:
>>>
>>>>
>>>>
>>>>Just got this from production:
>>>>
>>>>https://dl.dropboxusercontent.com/u/53724080/PCBs/ASP_SN1_top.jpg
>>>
>>>This is a much better board picture than the ones i had complained about.
>>
>>Which did you complain about?
>>
>>>Did you find someone in your company to do them?
>>
>>Yes. Me.
>>
>>I did crank down the resolution for public presentation. The original is much
>>better.
>>
>>I have an open-top light tent in the back of the building, next to the
>>north-facing windows. And a few boxes that work nicely as camera supports. The
>>north light is a little blue, but Irfanview fixes that.
>>
>>Works well enough. Show us some pics of your boards.
>
> Sure, just as soon as i make one.  I am still looking for the one i did
> over 40 years ago as a teen.  (Just when i get a bit bored though, doesn't
> happen much.)

It's been a long time since I last etched a board. It's in the works,
but here's what happens meanwhile. :)

http://crcomp.net/Diary/PCB/386.jpg

--
      __
   __/  \
  /  \__/
  \__/    Don Kuenz
  /  \__
  \__/  \
     \__/

Article: 156233
Subject: Re: Math is hard
From: John Miles <jmiles@gmail.com>
Date: Sun, 19 Jan 2014 18:56:33 -0800 (PST)

On Friday, January 17, 2014 11:00:06 AM UTC-8, Rob Gaddi wrote:
> Hey y'all --
>  ...
> Anyone have any thoughts?
> 

What are you actually using the function to accomplish?  Maybe there's a way to compute it incrementally.

-- john

Article: 156234
Subject: Re: embedded RAM vs. registers
From: alb <alessandro.basili@cern.ch>
Date: Mon, 20 Jan 2014 14:14:09 +0100

Hi Gabor,

On 1/19/2014 4:30 AM, Gabor wrote:
[]
>> I guess I do not completely follow you here, which mux are you 
>> talking about?
>> 
> 
> In a system with a processor (external or embedded) you typically 
> have some form of bus to read and write registers within the FPGA. 
> Normally you need the outputs of these registers all the time, so
> you can't just implement the whole thing as RAM.

I follow you if you talk about 'state registers', which of course are
needed to keep the current state of the logic, but there are lots of
'configuration registers' which do not need constant access to
their values.

A simple example would be the configuration of a UART: you do not need
to know *constantly* that you need a parity bit or two stop bits. This
type of 'memory' can go in a RAM. Would you agree?

> Now if the CPU wants to be able to read back the values it wrote,
> you need a big readback multiplexer (unless your IGLOO has internal 
> tristate buffers) to select the register you want to read back.

Got your point about the multiplexer.

> What I do is to have a RAM that keeps a copy of what was written by 
> the CPU.

I tend to avoid local copies of information since they may not mirror
efficiently, leading to multiple sources of 'truth' which eventually may
bite you.
How do you guarantee on a cycle-by-cycle basis that the two locations
match perfectly? What happens if they differ? If you do not need
cycle accuracy, then which location do you rely upon?

> Then the readback mux defaults to the output of this (simple 
> single-port) RAM unless the register is read-only or has some 
> side-effects that could change the register's value when it's not 
> being written by the CPU.  If you have a design with a whole lot of 
> registers, you can really reduce the size of the readback mux.

I now understand your, indeed valid, point.

> 
> Of course you could save even more logic by not having readback for 
> values that only change when written by the CPU.  These become 
> "write-only" registers, and the software guy then needs to keep his 
> own "shadow" copy of the values he wrote if he needs to read it back 
> later.

see my opinion on multiple copies above.

[]
> Someone said, "Opinions are like a**holes.  Everyone has one, and 
> they all stink."

See, we are not too far apart with our own personal opinion on 'opinions'.

> In any case I see you removed your signature from the latest post. 
> ;-)

That is done automatically by my mailer when I'm not the OP, so do not
get too excited about that ;-)

Article: 156235
Subject: Re: Math is hard
From: Rob Gaddi <rgaddi@technologyhighland.invalid>
Date: Mon, 20 Jan 2014 10:11:46 -0800

On Sat, 18 Jan 2014 22:06:22 GMT
Brian Drummond <brian3@shapes.demon.co.uk> wrote:

> On Fri, 17 Jan 2014 11:00:06 -0800, Rob Gaddi wrote:
> 
> > Hey y'all --
> > 
> > So this is one of those times that my lack of serious math chops is
> > coming round to bite me, and none of my usual books is helping me out.
> > I'm hoping someone has some thoughts.
> > 
> > I'm trying to approximate either exp(-1/(n+1)) or 4^(-1/(n+1)).  I can
> > convince myself I don't care which.  n is an integer from 1-65535, and
> > the result should be fixed-point fractional, probably U0.18.  The
> > function output is always between 0-1, and goes up like a rocket for
> > small n before leveling off to a steady cruise of >0.9 for the rest of
> > the function domain.
> 
> Quadratic interpolation is easy, and gets arbitrarily close to most 
> smooth curves.
> 
> Mock it up in a spreadsheet until its error is acceptably low, then 
> transfer to HDL. If the quadratic coefficients end up being vanishingly 
> small, then you can simplify to linear interpolation.
> 
> - Brian

I did a little playing with it again this morning, and just tried curve
fitting just the first small segment (n = 1:128).  Even a cubic fit has
immense errors over just that short of a range.  exp() really has
pretty vicious behavior.

-- 
Rob Gaddi, Highland Technology -- www.highlandtechnology.com
Email address domain is currently out of order.  See above to fix.

Article: 156236
Subject: Re: Math is hard
From: Rob Gaddi <rgaddi@technologyhighland.invalid>
Date: Mon, 20 Jan 2014 10:32:24 -0800

On Fri, 17 Jan 2014 11:00:06 -0800
Rob Gaddi <rgaddi@technologyhighland.invalid> wrote:

> Hey y'all --
> 
> So this is one of those times that my lack of serious math chops is
> coming round to bite me, and none of my usual books is helping me out.
> I'm hoping someone has some thoughts.
> 
> I'm trying to approximate either exp(-1/(n+1)) or 4^(-1/(n+1)).  I can
> convince myself I don't care which.  n is an integer from 1-65535, and
> the result should be fixed-point fractional, probably U0.18.  The
> function output is always between 0-1, and goes up like a rocket for
> small n before leveling off to a steady cruise of >0.9 for the rest of
> the function domain.
> 
> I'm working in an FPGA, so I've got adds, multiplies, and table lookups
> from tables of reasonable size (10s of kb) cheap, but other things
> (divides especially) are expensive.  I can throw several clock cycles
> at the problem if need be.
> 
> Taylor series attacks seem to fail horribly.  I feel like there may be
> some answer where answers for n in [1,127] gets a direct table lookup,
> and n in [128,65535] gets some other algorithm, possibly with a table
> boost.  Or somehow taking advantage of the fact that log(1-f(n)) is
> related to log(n)?
> 
> Anyone have any thoughts?
> 
> -- 
> Rob Gaddi, Highland Technology -- www.highlandtechnology.com
> Email address domain is currently out of order.  See above to fix.

Thanks for all the ideas, everyone.  I think at the end of the day,
I'm going to solve this problem by choosing to solve a different
problem instead.

The application was so trivial as to not be nearly worth all this
nonsense; a programmable asymmetric debounce of a digital input with
linearish behavior (a little momentary glitch should only slow the
resolution a bit, not reset the entire counter).  The spec as
written called for the time delay to be programmable in steps of
10us.  "Huh" I say to myself. "I'll just model this as anti-parallel
diodes, programmable resistors, a cap, and a Schmitt trigger."  After
all, tau is conveniently linear on R in continuous time.

So you've got y[n] = (1-a)x[n] + ay[n-1], and you get the result that
for a time-constant of N clocks, you get a=exp(-1/(N+1)) for limits
of .36 and .63, or a=4^(-1/(N+1)) for 0.25 and 0.75.  And then you get
the reality that crunching that equation is apparently horrible.
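
The relation between a and the thresholds can be checked numerically. The
sketch below is a floating-point model only, with a hypothetical N = 100:
it applies a unit step to y[n] = (1-a)x[n] + a*y[n-1] starting from y = 0
and looks at the output after N+1 updates, where y = 1 - a^(N+1).

```python
import math

def step_response(a, clocks):
    """Output of y[n] = (1-a)*x[n] + a*y[n-1] after `clocks` updates, x = 1, y[-1] = 0."""
    y = 0.0
    for _ in range(clocks):
        y = (1.0 - a) * 1.0 + a * y
    return y

N = 100  # hypothetical delay setting, in clocks
a_exp = math.exp(-1.0 / (N + 1))
a_four = 4.0 ** (-1.0 / (N + 1))

print(step_response(a_exp, N + 1))   # close to 1 - 1/e
print(step_response(a_four, N + 1))  # close to 0.75
```

With a = exp(-1/(N+1)) the output lands on 1 - 1/e, and with
a = 4^(-1/(N+1)) it lands on 0.75, which is where the two threshold pairs
come from.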

So the new plan is, since no one's overwhelmingly committed to that
particular implementation, that instead delays will be specified as the
slew rate at which an up/down counter will be fed; proportional to
1/delay, and I'll make the customer do the divide himself in the
comfort and privacy of his own FPU.
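
A rough behavioural model of that kind of counter, in Python. Everything
specific here is an assumption for illustration: the class name, the
separate up and down step sizes (for the asymmetry), the 16-bit full
scale, and the choice to switch the output only at the rails.

```python
class SlewDebounce:
    """Up/down counter debounce with programmable slew rates."""

    def __init__(self, up_step, down_step, full_scale=1 << 16):
        self.up_step = up_step        # counts added per clock while input is high
        self.down_step = down_step    # counts removed per clock while input is low
        self.full_scale = full_scale
        self.count = 0
        self.out = False

    def clock(self, x):
        if x:
            self.count = min(self.count + self.up_step, self.full_scale)
        else:
            self.count = max(self.count - self.down_step, 0)
        # Hysteresis: the output changes only when the counter hits a rail,
        # so a short glitch moves the counter without resetting it.
        if self.count == self.full_scale:
            self.out = True
        elif self.count == 0:
            self.out = False
        return self.out


deb = SlewDebounce(up_step=1024, down_step=4096)
outputs = [deb.clock(1) for _ in range(64)]
print(outputs[62], outputs[63])
```

The step size is proportional to 1/delay (here full_scale / delay in
clocks), so the divide happens once on the host, not in the FPGA.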

-- 
Rob Gaddi, Highland Technology -- www.highlandtechnology.com
Email address domain is currently out of order.  See above to fix.

Article: 156237
Subject: Re: Math is hard
From: robert bristow-johnson <rbj@audioimagination.com>
Date: Mon, 20 Jan 2014 17:11:41 -0500

On 1/20/14 1:07 PM, Rob Gaddi wrote:
> On Sat, 18 Jan 2014 05:33:02 -0800 (PST)
> radams2000@gmail.com wrote:
>
>> Math: the agony and dx/dt
>>
>
> Sorry, Bob, but I've heard pretty much all the math puns there are.
> That one's just derivative.
>

  <groan>

On 1/20/14 1:11 PM, Rob Gaddi wrote:
>
...
>
> I did a little playing with it again this morning, and just tried curve
> fitting just the first small segment (n = 1:128).  Even a cubic fit has
> immense errors over just that short of a range.  exp() really has
> pretty vicious behavior.

not as bad as log.  i can fit exp() over an octave range rather nicely 
with a 4th-order polynomial.  log() needs a 6th-order to be as good over 
an octave.

but to do the asymptotic exp(-x), any polynomial used to fit this will 
fail at the tail.  you would need 1/x or some other function with a 
similar asymptote in there.  so division is unavoidable, unless you do LUT.
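
To make the role of 1/x concrete, here is a small Python check. It assumes
the target exp(-1/(n+1)) and a U0.18 LSB, and uses the first three terms
of the series in x = 1/(n+1); the range n >= 128 and the function name are
hypothetical choices. The reciprocal on the first line of f_series is
exactly the division in question.

```python
import math

LSB = 2.0 ** -18  # one U0.18 step

def f_series(n):
    x = 1.0 / (n + 1)              # the division under discussion
    return 1.0 - x + 0.5 * x * x   # exp(-x) truncated after the x^2 term

worst = max(abs(f_series(n) - math.exp(-1.0 / (n + 1)))
            for n in range(128, 65536))
print(f"worst error for n >= 128: {worst / LSB:.4f} LSB")
```

Once x is available, a short polynomial in x tracks the tail well below
one LSB; the hard part is getting x without a divider or a table.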

-- 

r b-j                  rbj@audioimagination.com

"Imagination is more important than knowledge."

Article: 156238
Subject: Re: Math is hard
From: David Brown <david.brown@hesbynett.no>
Date: Tue, 21 Jan 2014 10:45:35 +0100

On 20/01/14 19:11, Rob Gaddi wrote:
> On Sat, 18 Jan 2014 22:06:22 GMT
> Brian Drummond <brian3@shapes.demon.co.uk> wrote:
> 
>> On Fri, 17 Jan 2014 11:00:06 -0800, Rob Gaddi wrote:
>>
>>> Hey y'all --
>>>
>>> So this is one of those times that my lack of serious math chops is
>>> coming round to bite me, and none of my usual books is helping me out.
>>> I'm hoping someone has some thoughts.
>>>
>>> I'm trying to approximate either exp(-1/(n+1)) or 4^(-1/(n+1)).  I can
>>> convince myself I don't care which.  n is an integer from 1-65535, and
>>> the result should be fixed-point fractional, probably U0.18.  The
>>> function output is always between 0-1, and goes up like a rocket for
>>> small n before leveling off to a steady cruise of >0.9 for the rest of
>>> the function domain.
>>
>> Quadratic interpolation is easy, and gets arbitrarily close to most 
>> smooth curves.
>>
>> Mock it up in a spreadsheet until its error is acceptably low, then 
>> transfer to HDL. If the quadratic coefficients end up being vanishingly 
>> small, then you can simplify to linear interpolation.
>>
>> - Brian
> 
> I did a little playing with it again this morning, and just tried curve
> fitting just the first small segment (n = 1:128).  Even a cubic fit has
> immense errors over just that short of a range.  exp() really has
> pretty vicious behavior.
> 

You can't fit a cubic (or other polynomial) by itself - you need a
spline.  And for a curve like this, you want varying steps, with far
more steps at the lower range.  You might be able to divide it up using
the first zero as the index (i.e., ranges 1..2, 2..4, 4..8, 8..16, etc.)
 The lowest range - say up to 16 - is probably best handled with a
straight lookup table.  After that, you'd need 12 sets of coefficients.
 If cubic interpolation here is not good enough, try higher odd powers
(anchor your polynomials at the end points and the derivatives at these
points to get a smooth spline).
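
A Python sketch of that scheme, under several assumptions: the segment
index is taken to be the position of the leading one bit of n, the target
is exp(-1/(n+1)), values below 16 come from a direct table (modelled here
by calling f), and each octave uses a cubic Hermite polynomial anchored on
the end-point values and derivatives. All names are hypothetical.

```python
import math

def f(n):
    """Target function exp(-1/(n+1))."""
    return math.exp(-1.0 / (n + 1))

def df(n):
    """Derivative of f with respect to n."""
    return f(n) / (n + 1) ** 2

def segment_index(n):
    """Leading-one position: 16..31 -> 4, 32..63 -> 5, ..., 32768..65535 -> 15."""
    return n.bit_length() - 1

def hermite(n):
    """Cubic Hermite interpolation of f over the octave containing n."""
    k = segment_index(n)
    n0, n1 = 1 << k, 1 << (k + 1)
    h = n1 - n0
    t = (n - n0) / h
    h00 = 2 * t**3 - 3 * t**2 + 1
    h10 = t**3 - 2 * t**2 + t
    h01 = -2 * t**3 + 3 * t**2
    h11 = t**3 - t**2
    return h00 * f(n0) + h10 * h * df(n0) + h01 * f(n1) + h11 * h * df(n1)

def approx(n):
    return f(n) if n < 16 else hermite(n)  # n < 16: straight lookup table

for k in range(4, 16):  # the 12 coefficient sets
    worst = max(abs(approx(n) - f(n)) for n in range(1 << k, 1 << (k + 1)))
    print(f"octave {k:2d}: max error {worst * 2**18:.2f} LSB (U0.18)")
```

The printed per-octave error shows whether one cubic per octave is enough
for an 18-bit result or whether the octaves need further subdivision.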

Article: 156239
Subject: Re: embedded RAM vs. registers
From: GaborSzakacs <gabor@alacron.com>
Date: Tue, 21 Jan 2014 11:28:54 -0500

alb wrote:
> Hi Gabor,
> 
> On 1/19/2014 4:30 AM, Gabor wrote:
> []
>>> I guess I do not completely follow you here, which mux are you 
>>> talking about?
>>>
>> In a system with a processor (external or embedded) you typically 
>> have some form of bus to read and write registers within the FPGA. 
>> Normally you need the outputs of these registers all the time, so
>> you can't just implement the whole thing as RAM.
> 
> I follow you if you talk about 'state registers', which of course are
> needed to keep the current state of the logic, but there are lots of
> 'configuration registers' which do not need constant access at
> their values.
> 
> A simple example would be the configuration of an UART, you do not need
> to know *constantly* that you need a parity bit or two stop bits. These
> type of 'memory' can go in a RAM. Would you agree?
> 

Not at all.  The UART needs to know how many stop bits and what sort of
parity to use whenever it transmits data.  That can be completely
asynchronous to the CPU data bus.  If the UART needed to get this info
from RAM, it would need another address port to that RAM.  That's a very
inefficient use of hardware to avoid storing 2 or 3 bits in a separate
register.  If you meant that the UART would read the RAM and then keep
a local copy, how is this different (in terms of resource usage) than
just having the register implemented in flip-flops?

>> Now if the CPU wants to be able to read back the values it wrote,
>> you need a big readback multiplexer (unless your IGLOO has internal 
>> tristate buffers) to select the register you want to read back.
> 
> Got your point about the multiplexer.
> 
>> What I do is to have a RAM that keeps a copy of what was written by 
>> the CPU.
> 
> I tend to avoid local copies of information since they may not mirror
> efficiently, leading to multiple sources of 'truth' which eventually may
> bite you.
> How do you guarantee on a cycle base that the two locations are
> perfectly matching? What happens if they differ? If you do not need
> cycle base accuracy then which location you rely upon?
> 
>> Then the readback mux defaults to the output of this (simple 
>> single-port) RAM unless the register is read-only or has some 
>> side-effects that could change the register's value when it's not 
>> being written by the CPU.  If you have a design with a whole lot of 
>> registers, you can really reduce the size of the readback mux.
> 
> I now understand your, indeed valid, point.
> 
>> Of course you could save even more logic by not having readback for 
>> values that only change when written by the CPU.  These become 
>> "write-only" registers, and the software guy then needs to keep his 
>> own "shadow" copy of the values he wrote if he needs to read it back 
>> later.
> 
> see my opinion on multiple copies above.

This is indeed an issue whenever you use this technique to save
resources.  I look at it as a trade-off.  In the case of readback
for read/write bits that only change when written by the CPU, the
only time you would be out of synch is at start-up.  In my case
I would either make a rule that the software must write every
register at least once before it could be read back, or I would
program the "RAM" with the initial register values at config time.
This works on Xilinx parts, where the configuration bitstream has
bits for all BRAM locations.  Not all FPGA's can do this, though.
Anyway, I thought this thread was about saving device resources...

-- 
Gabor

Article: 156240
Subject: Re: Math is hard
From: Rob Gaddi <rgaddi@technologyhighland.invalid>
Date: Tue, 21 Jan 2014 09:18:00 -0800

On Tue, 21 Jan 2014 10:45:35 +0100
David Brown <david.brown@hesbynett.no> wrote:

> On 20/01/14 19:11, Rob Gaddi wrote:
> > 
> > I did a little playing with it again this morning, and just tried curve
> > fitting just the first small segment (n = 1:128).  Even a cubic fit has
> > immense errors over just that short of a range.  exp() really has
> > pretty vicious behavior.
> > 
> 
> You can't fit a cubic (or other polynomial) by itself - you need a
> spline.  And for a curve like this, you want varying steps, with far
> more steps at the lower range.  You might be able to divide it up using
> the first zero as the index (i.e., ranges 1..2, 2..4, 4..8, 8..16, etc.)
>  The lowest range - say up to 16 - is probably best handled with a
> straight lookup table.  After that, you'd need 12 sets of coefficients.
>  If cubic interpolation here is not good enough, try higher odd powers
> (anchor your polynomials at the end points and the derivatives at these
> points to get a smooth spline).
> 

Not sure I'm following you there, this could be a problem with my
understanding.  When I hear people talking about spline fitting, to my
mind that's a series of piecewise polynomial approximations.  Piecewise
linear being the degenerate case.  Is that a reasonable statement?

The overall range I was trying to span was integer N 1:65535.  Trying
to fit a cubic to the range 1:128 was an attempt to see whether even
going to 512 (linearly spaced) pieces was going to give me a decent
approximation.  At least in the high curvature section of small N, the
results were ghastly.

-- 
Rob Gaddi, Highland Technology -- www.highlandtechnology.com
Email address domain is currently out of order.  See above to fix.

Article: 156241
Subject: Re: Math is hard
From: Randy Yates <yates@digitalsignallabs.com>
Date: Tue, 21 Jan 2014 12:33:53 -0500

Rob Gaddi <rgaddi@technologyhighland.invalid> writes:

> On Tue, 21 Jan 2014 10:45:35 +0100
> David Brown <david.brown@hesbynett.no> wrote:
>
>> On 20/01/14 19:11, Rob Gaddi wrote:
>> > 
>> > I did a little playing with it again this morning, and just tried curve
>> > fitting just the first small segment (n = 1:128).  Even a cubic fit has
>> > immense errors over just that short of a range.  exp() really has
>> > pretty vicious behavior.
>> > 
>> 
>> You can't fit a cubic (or other polynomial) by itself - you need a
>> spline.  And for a curve like this, you want varying steps, with far
>> more steps at the lower range.  You might be able to divide it up using
>> the first zero as the index (i.e., ranges 1..2, 2..4, 4..8, 8..16, etc.)
>>  The lowest range - say up to 16 - is probably best handled with a
>> straight lookup table.  After that, you'd need 12 sets of coefficients.
>>  If cubic interpolation here is not good enough, try higher odd powers
>> (anchor your polynomials at the end points and the derivatives at these
>> points to get a smooth spline).
>> 
>
> Not sure I'm following you there, this could be a problem with my
> understanding.  When I hear people talking about spline fitting, to my
> mind that's a series of piecewise polynomial approximations.  Piecewise
> linear being the degenerate case.  Is that a reasonable statement?
>
> The overall range I was trying to span was integer N 1:65535.  Trying
> to fit a cubic to the range 1:128 was an attempt to see whether even
> going to 512 (linearly spaced) pieces was going to give me a decent
> approximation.  At least in the high curvature section of small N, the
> results were ghastly.

Ron, why is such a fancy debouncing algorithm necessary? Just curious.
-- 
Randy Yates
Digital Signal Labs
http://www.digitalsignallabs.com

Article: 156242
Subject: Re: Math is hard
From: robert bristow-johnson <rbj@audioimagination.com>
Date: Tue, 21 Jan 2014 12:43:10 -0500

On 1/21/14 12:18 PM, Rob Gaddi wrote:
> On Tue, 21 Jan 2014 10:45:35 +0100
> David Brown<david.brown@hesbynett.no>  wrote:
>
>> On 20/01/14 19:11, Rob Gaddi wrote:
>>>
>>> I did a little playing with it again this morning, and just tried curve
>>> fitting just the first small segment (n = 1:128).  Even a cubic fit has
>>> immense errors over just that short of a range.  exp() really has
>>> pretty vicious behavior.
>>>
>>
>> You can't fit a cubic (or other polynomial) by itself - you need a
>> spline.  And for a curve like this, you want varying steps, with far
>> more steps at the lower range.  You might be able to divide it up using
>> the first zero as the index (i.e., ranges 1..2, 2..4, 4..8, 8..16, etc.)
>>   The lowest range - say up to 16 - is probably best handled with a
>> straight lookup table.  After that, you'd need 12 sets of coefficients.
>>   If cubic interpolation here is not good enough, try higher odd powers
>> (anchor your polynomials at the end points and the derivatives at these
>> points to get a smooth spline).
>>
>
> Not sure I'm following you there, this could be a problem with my
> understanding.  When I hear people talking about spline fitting, to my
> mind that's a series of piecewise polynomial approximations.

that's normally what i thought.  however, when i look up the math for 
"cubic splines" it doesn't seem to be exactly the same as fitting either 
3rd-order Lagrange or 3rd-order Hermite polynomials.  i would have
expected it to come out as one or the other.

>  Piecewise
> linear being the degenerate case.  Is that a reasonable statement?

who are you calling "degenerate"?!    :-)

probably piecewise-constant is more degenerate.  maybe zeros inserted 
between points is even more degenerate.  other than approximating the 
whole thing with zero (or some other constant), i cannot think of a more 
degenerate case.

> The overall range I was trying to span was integer N 1:65535.  Trying
> to fit a cubic to the range 1:128 was an attempt to see whether even
> going to 512 (linearly spaced) pieces was going to give me a decent
> approximation.  At least in the high curvature section of small N, the
> results were ghastly.

maybe direct lookup for the first 128 and some equally-spaced polynomial 
splines for the remaining 511 regions?

-- 

r b-j                  rbj@audioimagination.com

"Imagination is more important than knowledge."

Article: 156243
Subject: Re: Math is hard
From: Rob Gaddi <rgaddi@technologyhighland.invalid>
Date: Tue, 21 Jan 2014 09:47:52 -0800
Links: << >> << T >> << A >>

On Tue, 21 Jan 2014 12:33:53 -0500
Randy Yates <yates@digitalsignallabs.com> wrote:

> Rob Gaddi <rgaddi@technologyhighland.invalid> writes:
> 
> > On Tue, 21 Jan 2014 10:45:35 +0100
> > David Brown <david.brown@hesbynett.no> wrote:
> >
> >> On 20/01/14 19:11, Rob Gaddi wrote:
> >> > 
> >> > I did a little playing with it again this morning, and just tried curve
> >> > fitting just the first small segment (n = 1:128).  Even a cubic fit has
> >> > immense errors over just that short of a range.  exp() really has
> >> > pretty vicious behavior.
> >> > 
> >> 
> >> You can't fit a cubic (or other polynomial) by itself - you need a
> >> spline.  And for a curve like this, you want varying steps, with far
> >> more steps at the lower range.  You might be able to divide it up using
> >> the first zero as the index (i.e., ranges 1..2, 2..4, 4..8, 8..16, etc.)
> >>  The lowest range - say up to 16 - is probably best handled with a
> >> straight lookup table.  After that, you'd need 12 sets of coefficients.
> >>  If cubic interpolation here is not good enough, try higher odd powers
> >> (anchor your polynomials at the end points and the derivatives at these
> >> points to get a smooth spline).
> >> 
> >
> > Not sure I'm following you there, this could be a problem with my
> > understanding.  When I hear people talking about spline fitting, to my
> > mind that's a series of piecewise polynomial approximations.  Piecewise
> > linear being the degenerate case.  Is that a reasonable statement?
> >
> > The overall range I was trying to span was integer N 1:65535.  Trying
> > to fit a cubic to the range 1:128 was an attempt to see whether even
> > going to 512 (linearly spaced) pieces was going to give me a decent
> > approximation.  At least in the high curvature section of small N, the
> > results were ghastly.
> 
> Rob, why is such a fancy debouncing algorithm necessary? Just curious.
> -- 
> Randy Yates
> Digital Signal Labs
> http://www.digitalsignallabs.com

That's just it, at the end of the day I decided it wasn't.  The
asymmetry is really handy because it means the end customer can decide
between setting the rise/fall symmetrically and having basically just a
bandlimited view of the input, or asymmetrically, which provides glitch
detection; you can have 10s of ms to get around to looking for a brief
spike.  So that feature's handy.

But the actual implementation, whereby a first-order exponential
averager can be programmed with linear time constants?  Cute when it
looked like there was some simple way to get there.  As the
complication level got worse and worse, other options that are "a bit
of a nuisance" started looking a whole lot better.
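
For illustration only, a minimal Python sketch of the idea described
above: a first-order exponential averager whose rise and fall time
constants are set separately. It assumes the coefficient is the
exp(-1/(n+1)) from the start of this thread; the names coeff,
asymmetric_averager, n_rise and n_fall are hypothetical, and this is
not the actual HDL.

```python
import math

def coeff(n):
    # Assumed coefficient: the exp(-1/(n+1)) discussed earlier in the thread.
    return math.exp(-1.0 / (n + 1))

def asymmetric_averager(samples, n_rise, n_fall, y0=0.0):
    """First-order averager y = a*y + (1-a)*x with separate rise/fall constants."""
    a_rise, a_fall = coeff(n_rise), coeff(n_fall)
    y = y0
    out = []
    for x in samples:
        # Use the "rise" coefficient when the input is above the current output.
        a = a_rise if x > y else a_fall
        y = a * y + (1.0 - a) * x
        out.append(y)
    return out

# Unit step with n = 99: after n+1 = 100 samples the output is 1 - exp(-1).
step = asymmetric_averager([1.0] * 100, n_rise=99, n_fall=99)
print(step[-1])
```

With this coefficient the time constant comes out as n+1 samples,
which is what makes it "linear" in n. A small n_rise with a large
n_fall would give the glitch-stretching behavior.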

-- 
Rob Gaddi, Highland Technology -- www.highlandtechnology.com
Email address domain is currently out of order.  See above to fix.

Article: 156244
Subject: Re: embedded RAM vs. registers
From: glen herrmannsfeldt <gah@ugcs.caltech.edu>
Date: Tue, 21 Jan 2014 21:56:27 +0000 (UTC)
Links: << >> << T >> << A >>

GaborSzakacs <gabor@alacron.com> wrote:

(snip)
>>> In a system with a processor (external or embedded) you typically 
>>> have some form of bus to read and write registers within the FPGA. 
>>> Normally you need the outputs of these registers all the time, so
>>> you can't just implement the whole thing as RAM.

>> I follow you if you talk about 'state registers', which of course are
>> needed to keep the current state of the logic, but there are lots of
>> 'configuration registers' which do not need constant access at
>> their values.

>> A simple example would be the configuration of an UART, you do not need
>> to know *constantly* that you need a parity bit or two stop bits. These
>> type of 'memory' can go in a RAM. Would you agree?

> Not at all.  The UART needs to know how many stop bits and what sort of
> parity to use whenever it transmits data.  That can be completely
> asynchronous to the CPU data bus.  If the UART needed to get this info
> from RAM, it would need another address port to that RAM.  That's a very
> inefficient use of hardware to avoid storing 2 or 3 bits in a separate
> register.  If you meant that the UART would read the RAM and then keep
> a local copy, how is this different (in terms of resource usage) than
> just having the register implemented in flip-flops?

If you think of it that way, (and sometimes I do) then the 
microprocessor is the biggest waste of transistors ever 
invented. A huge number of transistors, now in the billions, 
to get data into, and out of, an arithmetic-logic-unit 
containing thousands of transistors.

Most of the time, a large fraction of the logic isn't doing
anything at all! 

Consider the old favorite of introductory digital logic 
laboratory courses, the digital clock. Almost nothing happens
most of the time (ignore display multiplex for now), but once
a second the display is updated. In the 1970s, you would build
one out of TTL chips. Though the FF's had the ability to
switch at MHz rates, here they ran at 1Hz or less. (Well,
divide down from 60Hz.) Again, the transistors are being
wasted, but now in the time domain instead of the spatial
domain.

A small MCU, with small, built-in RAM and ROM (maybe external 
ROM) has plenty of power to run a digital clock. Many more
transistors than the TTL version, and they are used more often
than the TTL version, but the economy of scale of building
small MCUs more than makes up for it.

As to the previous question, how to build a UART.

If you look inside a terminal server (not that anyone uses
them anymore) you find a microprocessor in place of 8 UARTs.
A single microprocessor is fast enough to collect the bits
from eight incoming serial ports, and drive the bits into
eight outgoing ports, along with keeping up the TCP connections
to the ethernet port. 

I am sure the people who designed and built some of the early
computers would think it strange that we now have a loop waiting
for the user to type on the keyboard. 

In the early days, single task batch processing made more 
efficient use of the available resources.  Not so much later,
multitasking allowed one to keep a single CPU busy, though with
less efficient use of RAM. (Decreasing cost of RAM vs. CPU.)

With an FPGA, one has the ability to keep a large number of
transistors (gates) busy a large fraction of the time, if one
has a problem big enough. 

-- glen

Article: 156245
Subject: Re: Math is hard
From: glen herrmannsfeldt <gah@ugcs.caltech.edu>
Date: Tue, 21 Jan 2014 22:12:23 +0000 (UTC)
Links: << >> << T >> << A >>

In comp.arch.fpga Rob Gaddi <rgaddi@technologyhighland.invalid> wrote:
> On Tue, 21 Jan 2014 10:45:35 +0100
> David Brown <david.brown@hesbynett.no> wrote:

(snip)
>> You can't fit a cubic (or other polynomial) by itself - you need a
>> spline.  And for a curve like this, you want varying steps, with far
>> more steps at the lower range.  You might be able to divide it up using
>> the first zero as the index (i.e., ranges 1..2, 2..4, 4..8, 8..16, etc.)
>>  The lowest range - say up to 16 - is probably best handled with a
>> straight lookup table.  After that, you'd need 12 sets of coefficients.
>>  If cubic interpolation here is not good enough, try higher odd powers
>> (anchor your polynomials at the end points and the derivatives at these
>> points to get a smooth spline).

> Not sure I'm following you there, this could be a problem with my
> understanding.  When I hear people talking about spline fitting, to my
> mind that's a series of piecewise polynomial approximations.  Piecewise
> linear being the degenerate case.  Is that a reasonable statement?

Seems to me that the advantage of cubic splines is the continuous
first and second derivative. (And more derivatives for higher
order splines.)  Also, that the result goes through the supplied
points.

If you fit an n-1 degree polynomial to n points, it will go through
the points, but likely fluctuate wildly in between. A lower order
polynomial will be less wild, but won't go through the points.
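
A small Python sketch of that "wild in between" behavior. It uses
Runge's function 1/(1+25x^2) on [-1, 1] as a stand-in (an assumption
for illustration, not the function from this thread), with 11 equally
spaced points and plain Lagrange interpolation. All names here are
made up.

```python
def lagrange_eval(xs, ys, x):
    """Evaluate the polynomial through the points (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def runge(x):
    return 1.0 / (1.0 + 25.0 * x * x)

N = 11  # 11 points, so a degree-10 polynomial
nodes = [-1.0 + 2.0 * i / (N - 1) for i in range(N)]
values = [runge(x) for x in nodes]

# Compare against the true function on a fine grid between the points.
probe = [-1.0 + 2.0 * i / 1000 for i in range(1001)]
max_err = max(abs(lagrange_eval(nodes, values, x) - runge(x)) for x in probe)
print("max error between the points:", max_err)
```

The polynomial hits every supplied point exactly, but the error
between points near the ends of the interval is larger than the
function itself.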

> The overall range I was trying to span was integer N 1:65535.  Trying
> to fit a cubic to the range 1:128 was an attempt to see whether even
> going to 512 (linearly spaced) pieces was going to give me a decent
> approximation.  At least in the high curvature section of small N, the
> results were ghastly.

It isn't required that an N element lookup table use N linearly
spaced points, but it does simplify the logic. Consider how A-law
and u-law coding allows reasonable dynamic range for digital
telephone audio. 

-- glen

Article: 156246
Subject: Re: Math is hard
From: robert bristow-johnson <rbj@audioimagination.com>
Date: Tue, 21 Jan 2014 17:21:38 -0500
Links: << >> << T >> << A >>

On 1/21/14 5:12 PM, glen herrmannsfeldt wrote:
> In comp.arch.fpga Rob Gaddi <rgaddi@technologyhighland.invalid> wrote:
>> On Tue, 21 Jan 2014 10:45:35 +0100
>> David Brown <david.brown@hesbynett.no> wrote:
>
> (snip)
>>> You can't fit a cubic (or other polynomial) by itself - you need a
>>> spline.  And for a curve like this, you want varying steps, with far
>>> more steps at the lower range.  You might be able to divide it up using
>>> the first zero as the index (i.e., ranges 1..2, 2..4, 4..8, 8..16, etc.)
>>>   The lowest range - say up to 16 - is probably best handled with a
>>> straight lookup table.  After that, you'd need 12 sets of coefficients.
>>>   If cubic interpolation here is not good enough, try higher odd powers
>>> (anchor your polynomials at the end points and the derivatives at these
>>> points to get a smooth spline).
>
>> Not sure I'm following you there, this could be a problem with my
>> understanding.  When I hear people talking about spline fitting, to my
>> mind that's a series of piecewise polynomial approximations.  Piecewise
>> linear being the degenerate case.  Is that a reasonable statement?
>
> Seems to me that the advantage of cubic splines is the continuous
> first and second derivative.

*second* derivative?  how does a third-order polynomial, with 4 
coefficients, satisfy 6 constraints - 3 on each side, left and right?

i seem to remember the term "osculating" from Duane Wise.

i can understand how the Hermite polynomials do it, but i can't see how 
this additional derivative works.


-- 

r b-j                  rbj@audioimagination.com

"Imagination is more important than knowledge."

Article: 156247
Subject: Re: Math is hard
From: glen herrmannsfeldt <gah@ugcs.caltech.edu>
Date: Tue, 21 Jan 2014 22:27:02 +0000 (UTC)
Links: << >> << T >> << A >>

In comp.arch.fpga robert bristow-johnson <rbj@audioimagination.com> wrote:
> On 1/21/14 12:18 PM, Rob Gaddi wrote:
(snip)
>> Not sure I'm following you there, this could be a problem with my
>> understanding.  When I hear people talking about spline fitting, to my
>> mind that's a series of piecewise polynomial approximations.
 
> that's normally what i thought.  however, when i look up the math for 
> "cubic splines" it doesn't seem to be exactly the same as fitting either 
> 3rd-order Lagrange or 3rd-order Hermite polynomials.  i would have 
> expected it to come out as one or the other.

Fit N-1 cubic sections to N points, you need 4N-4 parameters.
Going through the points at either end, is 2N-2. Continuous
first and second derivative at the N-2 splice points, 2N-4 more.
That leaves two more, which are the boundary conditions at the
end points. 
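
The same bookkeeping as a few lines of Python, as a sanity check. The
function name is hypothetical. The smoothness conditions are shared
between neighboring sections, which is why a section's 4 coefficients
are not asked to satisfy 6 independent constraints.

```python
def cubic_spline_constraint_count(n_points):
    """Count unknowns and constraints for a cubic spline through n_points."""
    sections = n_points - 1
    unknowns = 4 * sections            # 4 coefficients per cubic section
    interpolation = 2 * sections       # each section passes through both its end points
    smoothness = 2 * (n_points - 2)    # 1st and 2nd derivative match at each splice point
    boundary = unknowns - interpolation - smoothness
    return unknowns, interpolation, smoothness, boundary

for n in (3, 10, 512):
    print(n, cubic_spline_constraint_count(n))
```

The last number is 2 for any N >= 3, the two boundary conditions at
the end points.
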
>>  Piecewise linear being the degenerate case.  
>> Is that a reasonable statement?
 
> who are you calling "degenerate"?!    :-)
 
> probably piecewise-constant is more degenerate.  maybe zeros inserted 
> between points is even more degenerate.  other than approximating the 
> whole thing with zero (or some other constant), i cannot think of a more 
> degenerate case.
 
(snip)

> maybe direct lookup for the first 128 and some equally-spaced polynomial 
> splines for the remaining 511 regions?

That might be the more obvious, and easiest to implement, of the
non-uniform spaced interpolation methods.

Reminds me of some years ago, someone was building a sampling
system for a signal with a long exponential decaying tail.

(And when high-speed RAM was pretty expensive.)

The idea was to sample at full speed for 10 samples, then average
groups of 10 for the next 100 (or maybe 90), then average groups
of 100 for the next. Then exponentially longer and longer
blocks after that. They would do all the averaging in logic, 
before storing into the not-too-big RAM.
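
A rough Python model of that scheme, assuming 10 stored values per
stage and a block length that grows by 10x each stage. The function
and parameter names are invented for illustration, and the real thing
would of course be done in logic.

```python
def log_block_average(samples, per_stage=10, growth=10):
    """Store raw samples first, then averages of ever longer blocks."""
    out = []
    pos = 0
    block = 1  # block length 1 means "store the raw sample"
    while pos < len(samples):
        for _ in range(per_stage):
            chunk = samples[pos:pos + block]
            if not chunk:
                break
            out.append(sum(chunk) / len(chunk))
            pos += len(chunk)
        block *= growth
    return out

# 1110 input samples (10 raw + 10 blocks of 10 + 10 blocks of 100)
# collapse to 30 stored values.
stored = log_block_average(list(range(1110)))
print(len(stored))
```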

I had some ideas how to build one, but they didn't ask.

-- glen

Article: 156248
Subject: Re: embedded RAM vs. registers
From: jonesandy@comcast.net
Date: Tue, 21 Jan 2014 16:27:40 -0800 (PST)
Links: << >> << T >> << A >>

Al,

Most "automatic" conversion of logic from LUTs to RAMs involves using
the RAMs like ROMs, preloaded with constant data during configuration.
Flash based FPGAs from MicroSemi do not have the ability to preload
their BRAMs during "configuration." There is no "configuration" phase
at/during startup during which they could automatically be preloaded.

Furthermore, the IGLOO/ProASIC3 series only provide synchronous BRAMs
with a clock cycle delay between address in and data out. They can be
inferred from RTL, so long as your RTL includes that clock cycle delay.

If you have several identical slow speed interfaces (e.g. UARTs, SPI,
I2C, etc.) that could happily run with an effective clock rate of a
fraction of your system clock rate, look at C-slow optimization to
reduce utilization. There are a few coding tricks that ease translating
a single-channel module into a multi-channel, C-slowed module capable
of replacing multiple copies of the original.

Retiming can be combined with C-slowing (the two are very synergistic)
to enable the original clock rate to be increased, recovering some of
the original per-channel performance.

Repipelining can be combined with C-slowing (also synergistic) to hide
original design latency, thus recovering some of the per-channel
performance without increasing the system clock rate.

Andy

Article: 156249
Subject: Re: Math is hard
From: David Brown <david.brown@hesbynett.no>
Date: Wed, 22 Jan 2014 12:38:42 +0100
Links: << >> << T >> << A >>

(I know the OP has found a better way to solve the original problem, but
I think the discussion here is still fun!)

On 21/01/14 23:12, glen herrmannsfeldt wrote:
> In comp.arch.fpga Rob Gaddi <rgaddi@technologyhighland.invalid> wrote:
>> On Tue, 21 Jan 2014 10:45:35 +0100
>> David Brown <david.brown@hesbynett.no> wrote:
> 
> (snip)
>>> You can't fit a cubic (or other polynomial) by itself - you need a
>>> spline.  And for a curve like this, you want varying steps, with far
>>> more steps at the lower range.  You might be able to divide it up using
>>> the first zero as the index (i.e., ranges 1..2, 2..4, 4..8, 8..16, etc.)

(Note that this should be "first one", not "first zero".)

>>>  The lowest range - say up to 16 - is probably best handled with a
>>> straight lookup table.  After that, you'd need 12 sets of coefficients.
>>>  If cubic interpolation here is not good enough, try higher odd powers
>>> (anchor your polynomials at the end points and the derivatives at these
>>> points to get a smooth spline).
>  
>> Not sure I'm following you there, this could be a problem with my
>> understanding.  When I hear people talking about spline fitting, to my
>> mind that's a series of piecewise polynomial approximations.  Piecewise
>> linear being the degenerate case.  Is that a reasonable statement?
> 

Ignoring other people's comments about degenerate cases, yes, that's
correct.

> Seems to me that the advantage of cubic splines is the continuous
> first and second derivative. (And more derivatives for higher
> order splines.)  Also, that the result goes through the supplied
> points.
> 
> If you fit an n-1 degree polynomial to n points, it will go through
> the points, but likely fluctuate wildly in between. A lower order
> polynomial will be less wild, but won't go through the points.

For a cubic spline, 3rd derivatives are constant - and higher order
derivatives are always 0.  This avoids the "wildness" of many higher
order polynomials - in particular, if you try to fit a high order
polynomial directly to a set of points then higher derivatives are often
very large.  Cubic splines are often chosen for being more flexible (and
curvy) than linear splines, but with more tractable maths and less
"wildness" than higher orders.

But remember that there are /many/ ways to make a cubic spline for a set
of points or for a given function.  A common way - as glen mentioned in
another post - is to make it pass through all N points and make the
first and second derivatives match up smoothly.  But there are others,
such as aiming to minimise the RMS error from many points, or making the
values and first derivatives match the target curve at the boundary
points.  Some methods involve solving huge matrices - calculating the
entire spline in one go - and others are done piecemeal.

> 
>> The overall range I was trying to span was integer N 1:65535.  Trying
>> to fit a cubic to the range 1:128 was an attempt to see whether even
>> going to 512 (linearly spaced) pieces was going to give me a decent
>> approximation.  At least in the high curvature section of small N, the
>> results were ghastly.

That is why I said you should not be using linearly spaced pieces - you
need closer sections at the higher curvature part of the table.

> 
> It isn't required that an N element lookup table use N linearly
> spaced points, but it does simplify the logic. Consider how A-law
> and u-law coding allows reasonable dynamic range for digital
> telephone audio. 
> 

Linear spacing makes the table smaller (you don't have to store "x"
values), and lookup faster (you can go directly to the right line).  My
thought on this function is that indexing by the first one bit could give
you a compact and fast table with non-linear spacing (approximately
logarithmic spacing).
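
A minimal Python sketch of that indexing, assuming the target function
is the exp(-1/(n+1)) from the original post, a direct table below 16,
and one segment per octave above that (12 segments for n up to 65535).
To keep it short it uses linear interpolation inside each segment
rather than a cubic, so it shows the indexing, not the accuracy you
would need for U0.18. All names are hypothetical.

```python
import math

def f(n):
    # Assumed target function from the original post.
    return math.exp(-1.0 / (n + 1))

DIRECT_LIMIT = 16  # n below this uses a straight lookup table
DIRECT = [f(n) for n in range(DIRECT_LIMIT)]

# One (start value, slope) pair per octave [2^k, 2^(k+1)) for k = 4..15.
SEGMENTS = {}
for k in range(4, 16):
    lo, hi = 1 << k, 1 << (k + 1)
    SEGMENTS[k] = (f(lo), (f(hi) - f(lo)) / (hi - lo))

def approx(n):
    if n < DIRECT_LIMIT:
        return DIRECT[n]
    k = n.bit_length() - 1  # position of the first (leading) one bit
    base, slope = SEGMENTS[k]
    return base + slope * (n - (1 << k))

worst = max(abs(approx(n) - f(n)) for n in range(1, 65536))
print("segments:", len(SEGMENTS), "worst linear error:", worst)
```

In hardware the segment select is just a priority encoder on n, and
the offset within the segment is the bits below the leading one, so no
"x" values need to be stored.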
