Messages from 117225

Article: 117225
Subject: Spartan 3E Not enough block ram.
From: "Ken Soon" <csoon@xilinx.com>
Date: Tue, 27 Mar 2007 11:06:11 +0800

I am trying to port a design (a video scaler) from Virtex 4 to Spartan 3E,
and I am currently running out of block RAMs.

My reference (top) design uses only 15 block RAMs, but after wrapping a
"wrapper" module around my top design, the block RAM count shoots up to 60.
So I decided to look into this wrapper module. It instantiates a lot of
dp_bram modules, so I looked into the dp_bram module and found the
following code. (I took out the relevant portions of it for easier
understanding.)

(Just for note: This wrapper module is the wrapper around the video scaler
for trial synthesis.  It includes two instances of the scaler, as well as a
simple bus interface for the control register inputs.)

--Information in Entity portion--
data_width           :  integer  := 8;
mem_size             :  integer  := 1920
wr_addr  :  in    std_logic_vector((LOG2_BASE(mem_size) - 1) downto 0);
rd_addr  :  in    std_logic_vector((LOG2_BASE(mem_size) - 1) downto 0);
din      :  in    std_logic_vector((data_width - 1) downto 0);


--Architecture portion--
process (wr_clk)
   begin
      if (wr_clk'event and wr_clk = '1') then
         if (ce = '1') then
            if (wr_en = '1') then
               mem_array(conv_integer('0' & wr_addr)) <= din;
            end if;
         end if;
      end if;
   end process;

   process (rd_clk)
   begin
      if (rd_clk'event and rd_clk = '1') then
         if (ce = '1') then
            if (rd_en = '1') then
               dout <= mem_array(conv_integer('0' & rd_addr));
            end if;
         end if;
      end if;
   end process;

Now I guess this mem_array (mem_array(conv_integer('0' & wr_addr)) <= din;)
is the main culprit using the block RAMs for data storage, right?
Hmm, so how should I go about solving this problem of not having enough
block RAMs? Since I have some DDR SDRAM on my board, I have tried to
learn about DRAM and DRAM controllers, but whoa, it is a little too
overwhelming to understand (I have little experience with FPGAs). Could
anyone please simplify the usage of DRAM? (Or maybe it really is not that
simple? Hmm.)

Oh yeah, I have also tried looking at the Xilinx Memory Interface Generator
for a DDR SDRAM controller. I am trying to figure out how to use it and,
later, how to integrate it into my design.

(Gosh, why can't I just have the mem_array automatically use the
DRAM....)
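
As a rough check on where the block RAMs go, the count per dp_bram
instance can be estimated from the generics. The sketch below is only an
estimate under stated assumptions: it assumes the usual 18 Kbit block RAM
aspect ratios (16Kx1 through 512x36) and that the array depth is padded to
the next power of two, as the LOG2_BASE address width suggests. The names
`ASPECT_RATIOS` and `brams_needed` are made up for illustration and are not
tool output.

```python
# Assumed block RAM aspect ratios as (depth, width in bits).
# The x9/x18/x36 widths include the parity bits, which can hold data.
ASPECT_RATIOS = [(16384, 1), (8192, 2), (4096, 4),
                 (2048, 9), (1024, 18), (512, 36)]


def ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)


def brams_needed(mem_size, data_width):
    """Estimate the fewest block RAMs for a mem_size x data_width memory."""
    # The address bus has ceil(log2(mem_size)) bits, so the array is
    # effectively padded to the next power of two.
    depth = 1 << (mem_size - 1).bit_length()
    return min(ceil_div(depth, d) * ceil_div(data_width, w)
               for d, w in ASPECT_RATIOS)


print(brams_needed(1920, 8))    # default generics from the post
print(brams_needed(1920, 24))   # hypothetical 24-bit-wide line buffer
```

Under these assumptions a default 1920 x 8 instance fits in a single block
RAM, so the total is driven mainly by how many instances the wrapper
creates and how wide they are; the synthesis report remains the
authoritative count.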

Article: 117226
Subject: Variable delay line (was Re: shift register with distributed ram)
From: "Marty Ryba" <martin.ryba.nospam@verizon.net>
Date: Tue, 27 Mar 2007 03:09:01 GMT

Thanks for the suggestions. Just to clarify, it is a clocked delay that I 
want; everything is on a clock, which functions as a sample counter as well. 
Say there is a bitstream, and I want two copies of it: one "prompt" copy and 
one "delayed" copy with the delay being a variable number of samples in some 
kind of buffer. This is easy with RAM in a GPP (pointer arithmetic), but a 
GPP is not fast enough for pipelined processing (~30 Mbps on each of 10 or 
more bit streams). During routine processing, the delay is fixed and I want 
on each clock the pipeline to shift by one. Now, on subsequent processing 
cycles, *maintaining the state of the pipeline* I may want to tap a 
different delay point.

Now, since I want more than 16 taps of delays, I see two approaches (let N 
be the total delay):

1) The Q15's are connected for cascading, the output of the (N/16)+1 SRL is 
set to the remainder of the delay, and a mux selects the output pin of the 
(N/16)+1 SRL to use as output of the block. Based on the delay value, the 
timing of the appearance of the correct sample depends on how many SRL's it 
traverses before it exits. So, delay needs to be inserted to make it constant. 
I'm likely glossing over details I don't quite understand since I didn't 
code it myself, and most of this design was done 2 years ago. This is how I 
believe it's implemented right now.

2) The output of each SRL is connected to the next one's input. The first 
(N/16) of the SRL's addresses would be set to 15 (max delay), the "middle" 
one would have mod(N,16), and the others would all be set to zero. The 
block's output is connected to the output of the last SRL. This I think 
would give a consistent delay of about N+(# of SRL's). The problem is that 
it would enforce a minimum delay that I would likely have to insert into my 
"prompt" channel to balance things back out.

3) Other ideas?? For instance, I actually would prefer to be able to create 
tens of "fingers" of delay without needing separate parallel pipelines but 
maybe by having them cascade into each other. Any app notes out there that I 
haven't dug up yet? I'm revisiting this since we're restarting the program 
and have the opportunity to revamp parts of the design.
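
Approach 2 above can be checked with a small behavioral model. This is a
sketch, assuming each SRL16 delays its input by (address + 1) clocks; the
names `srl_addresses`, `chain_delay`, and `simulate` are illustrative and
not part of any Xilinx library. Under that assumption the chain delay works
out to the sum of the addresses plus the number of SRLs, which is close to,
but slightly less than, N + (# of SRLs) when the full stages are set to 15.

```python
def srl_addresses(n, num_srls):
    """Approach 2 settings: full stages at 15, one remainder stage, then zeros."""
    full, rem = divmod(n, 16)
    if full + 1 > num_srls:
        raise ValueError("not enough SRL16s for the requested delay")
    return [15] * full + [rem] + [0] * (num_srls - full - 1)


def chain_delay(addresses):
    """Each SRL16 delays by (address + 1) clocks, so every stage adds one."""
    return sum(a + 1 for a in addresses)


def simulate(addresses, samples):
    """Clock samples through the chain; each stage is a 16-deep shift register."""
    regs = [[0] * 16 for _ in addresses]
    out = []
    for s in samples:
        # Values visible at each stage's selected tap before the clock edge
        taps = [r[a] for r, a in zip(regs, addresses)]
        out.append(taps[-1])
        # On the clock edge, stage 0 shifts in the sample and every other
        # stage shifts in the tap of the stage before it
        for k, r in enumerate(regs):
            r.insert(0, s if k == 0 else taps[k - 1])
            r.pop()
    return out


addrs = srl_addresses(37, 4)
impulse = [1] + [0] * 99
print(addrs, chain_delay(addrs), simulate(addrs, impulse).index(1))
```

The impulse test makes the fixed per-stage overhead visible: it is the same
for every tap setting, so a constant matching delay in the "prompt" channel
would balance it.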

Marty Ryba
semi-mad scientist
proud member of the Luxuriant Flowing Hair Club for Scientists (no kidding!)

"Peter Alfke" <alfke@sbcglobal.net> wrote in message 
news:1174881568.648560.274310@n59g2000hsh.googlegroups.com...
> Unclocked delay lines are really not stable over temperature and
> voltage, although there are "servo" tricks to stabilize them (as done
> in the IDELAY and ODELAY Virtex I/O functions)
> For a very long clocked delay line, it might make sense to use a dual-
> ported BlockRAM.
> "Waste is often only in the eyes of the beholder..."
> Peter Alfke, Xilinx
> On Mar 25, 8:35 pm, John_H <newsgr...@johnhandwork.com> wrote:
>> Marty Ryba wrote:
>> > Slightly in another direction...is there a trick to setting up the
>> > cascades on the SRL16s to maintain a consistent delay? We strung 8 in
>> > a row to get an adjustable 1-bit delay line. It works, but there's a
>> > bunch of extra muxes, etc. to get the delay consistent (3 clocks plus
>> > whatever tap I pull as output). I'm actually the systems guy and not
>> > the VHDL coder (and communication of requirements is always tricky
>> > when you're doing something new), but I'm always interested in the
>> > mechanics to see if it can be done better (smaller) while still
>> > meeting requirements. Especially since I'd really like a 1024 tap
>> > delay but I ran out of space (I need tens of these, plus other DSP
>> > goodies). Suggestions on other mechanisms to use are also welcome.

>> Shift registers are clocked.  Clocked elements don't have routing
>> consistency issues, they have routing maximum issues.  I'd suggest using
>> some Xilinx routing for combinatorial delays in an *extremely* well
>> controlled situation, inverting consecutive stages of a multi-tap delay
>> to reduce pulse width distortion.  But a 1024 element delay line?!  It
>> sounds like you need a nice, clocked delay.  SRLs in series shouldn't
>> have delay issues.
>>
>> Is it that you're taking the output from a very long clocked shift
>> register?  If so, just clock the muxed outputs to get all the SRLs to
>> show up at the output pin at a predictable time.
>>
>> Often the conceptual problem with unclocked delay lines is figuring out
>> how to get a consistent input path or a consistent output path; the
>> trouble is, both are needed.
>>
>> What is your desired range and resolution?  Acceptable jitter?
>
>

Article: 117227
Subject: No results show up after "dow" and "con" in hypertrm
From: "fouRmi" <sunrui82@gmail.com>
Date: 26 Mar 2007 20:10:54 -0700

Hi, all

I am a student at Beihang University in Beijing, P.R. China. I have just
started learning to build a uClinux system on MicroBlaze. I'm reading
a document named uClinux_ready_Microblaze_design_1_05_a.

I have run into a problem. My target board is an ML401, and I am using
EDK 8.1 as instructed in the doc. I strictly followed the steps and
everything seemed to go smoothly until the last step, downloading the
kernel image to the target board. There are still no errors, either on
the XMD command line or on the board, but no uClinux boot messages show
up in hypertrm either. I wonder whether there is an error in the kernel
image or some other error.

Maybe some of the questions are silly. Thanks very much for your
patience in reading through my e-mail. I hope you can help me.

Orlando Sun

The following is the output from the XMD.

#####################
XMD% bash dl.sh
Memory address Found
Memory Address is 0x24000000
Device xc4vlx25 Found
Architecture device is xc4vlx25
Kimage is image.bin
Kparams are
Xilinx Microprocessor Debug (XMD) Engine
Xilinx EDK 8.1.02 Build EDK_I.20.4
Copyright (c) 1995-2005 Xilinx, Inc.  All rights reserved.

XMD%
XMD% Loading MHS File..
Processor(s) in System ::

Microblaze(1) : microblaze_0
Address Map for Processor microblaze_0
  (0x00000000-0x00001fff) dlmb_cntlr    dlmb
  (0x00000000-0x00001fff) ilmb_cntlr    ilmb
  (0x22000000-0x227fffff) FLASH_2Mx32   mb_opb
  (0x24000000-0x27ffffff) DDR_SDRAM_64Mx32      mb_opb
  (0x40600000-0x4060ffff) RS232_Uart    mb_opb
  (0x41200000-0x4120ffff) opb_intc_0    mb_opb
  (0x41400000-0x4140ffff) debug_module  mb_opb
  (0x41c00000-0x41c0ffff) opb_timer_1   mb_opb

XMD%  No such Gdb Server

No Processor target with id = 0
XMD% XMD% Connecting to cable (Parallel Port - LPT1).
Checking cable driver.
 Driver windrvr6.sys version = 7.0.0.0. LPT base address = 0378h.
 ECP base address = 0778h.
Cable connection failed.
Connecting to cable (Parallel Port - LPT1).
Checking cable driver.
 Driver windrvr6.sys version = 7.0.0.0. LPT base address = 0378h.
 ECP base address = 0778h.
Cable connection failed.
Connecting to cable (Parallel Port - LPT2).
Checking cable driver.
 Driver windrvr6.sys version = 7.0.0.0.Cable connection failed.
Connecting to cable (Parallel Port - LPT2).
Checking cable driver.
 Driver windrvr6.sys version = 7.0.0.0.Cable connection failed.
Connecting to cable (Usb Port - USB22).
Checking cable driver.
 Driver xusbdfwu.sys version: 1018 (1018).
 Driver windrvr6.sys version = 7.0.0.0.Calling setinterface num=0, alternate=0.
DeviceAttach: received and accepted attach for:
  vendor id 0x3fd, product id 0x8, device handle 0x2970038
 Max current requested during enumeration is 280 mA.
 Cable Type = 3, Revision = 0.
 Setting cable speed to 6 MHz.
Cable connection established.
Firmware version = 1018.
CPLD file version = 0006h.
CPLD version = 0012h.

JTAG chain configuration
--------------------------------------------------
Device   ID Code        IR Length    Part Name
 1       0a001093           8        System_ACE
 2       05059093          16        XCF32P
 3       0167c093          10        XC4VLX25
 4       09608093           8        xc95144xl
Assuming, Device No: 3 contains the MicroBlaze system
Connected to the JTAG MicroProcessor Debug Module (MDM)
No of processors = 1

MicroBlaze Processor 1 Configuration :
-------------------------------------
Version............................4.00.a
No of PC Breakpoints...............2
No of Read Addr/Data Watchpoints...0
No of Write Addr/Data Watchpoints..0
Instruction Cache Support..........on
Instruction Cache Base Address.....0x24000000
Instruction Cache High Address.....0x27ffffff
Data Cache Support.................on
Data Cache Base Address............0x24000000
Data Cache High Address............0x27ffffff
Exceptions  Support................off
FPU  Support.......................off
FSL DCache Support.................off
FSL ICache Support.................off
Hard Divider Support...............off
Hard Multiplier Support............on
Barrel Shifter Support.............off
MSR clr/set Instruction Support....off
Compare Instruction Support........off
JTAG MDM Connected to MicroBlaze 1
Connected to "mb" target. id = 0
Starting GDB server for "mb" target (id = 0) at TCP port no 1234
XMD% Downloading kernel image...
XMD% This will only take a few seconds!
XMD% XMD% setting kernel command line
XMD% XMD% "cmdline is "
XMD% 2
XMD% 0x100
XMD% > > > > XMD% XMD% XMD% XMD% Processor started. Type "stop" to stop processor
RUNNING> Done already WOW!
RUNNING> Closing MDM communication with Processor 1
##############################

Article: 117228
Subject: Re: Variable delay line (was Re: shift register with distributed ram)
From: "Peter Alfke" <alfke@sbcglobal.net>
Date: 26 Mar 2007 20:34:32 -0700


On Mar 26, 8:09 pm, "Marty Ryba" <martin.ryba.nos...@verizon.net>
wrote:
> Thanks for the suggestions. Just to clarify, it is a clocked delay that I
> want; everything is on a clock, which functions as a sample counter as well.
> Say there is a bitstream, and I want two copies of it: one "prompt" copy and
> one "delayed" copy with the delay being a variable number of samples in some
> kind of buffer. This is easy with RAM in a GPP (pointer arithmetic), but a
> GPP is not fast enough for pipelined processing (~30 Mbps on each of 10 or
> more bit streams).
Marty, forget the GPP and just wrap two counters around a BlockRAM.
That can run ten times faster than you need it. Maybe you can then do
some time-division multiplexing to save on BlockRAMs...just an idea...
Peter Alfke, from home
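
The "two counters around a BlockRAM" idea can be sketched as a behavioral
model. This is only an illustration under assumptions: the class name
`BramDelayLine` and its `clock` method are hypothetical, the read address is
derived as the write counter minus the delay (equivalent to a second counter
running at an offset), and the extra clock of latency of a real synchronous
block RAM read is ignored.

```python
class BramDelayLine:
    """Behavioral model of a dual-port RAM used as a variable delay line."""

    def __init__(self, depth=1024):
        self.depth = depth
        self.mem = [0] * depth   # stands in for the block RAM contents
        self.wr_addr = 0         # write counter, advances every clock

    def clock(self, din, delay):
        """One clock: write din, return the sample written `delay` clocks ago."""
        if not 1 <= delay < self.depth:
            raise ValueError("delay must be in 1..depth-1")
        # The read counter trails the write counter by `delay` (mod depth).
        rd_addr = (self.wr_addr - delay) % self.depth
        dout = self.mem[rd_addr]
        self.mem[self.wr_addr] = din
        self.wr_addr = (self.wr_addr + 1) % self.depth
        return dout


line = BramDelayLine(depth=16)
delayed = [line.clock(sample, delay=5) for sample in range(1, 21)]
print(delayed)
# Moving the tap only changes the read offset; the stored history is kept.
print(line.clock(21, delay=3))
```

Because changing the delay touches only the read address, the buffer
contents survive a tap change, which matches the requirement of maintaining
the state of the pipeline between processing cycles.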

> During routine processing, the delay is fixed and I want
> on each clock the pipeline to shift by one. Now, on subsequent processing
> cycles, *maintaining the state of the pipeline* I may want to tap a
> different delay point.
>
> Now, since I want more than 16 taps of delays, I see two approaches (let N
> be the total delay):
>
> 1) The Q15's are connected for cascading, the output of the (N/16)+1 SRL is
> set to the remainder of the delay, and a mux selects the output pin of the
> (N/16)+1 SRL to use as output of the block. Based on the delay value, the
> timing of the appearance of the correct sample depends on how many SRL's it
> traverses before it exits. So, delay needs to be inserted to make it constant.
> I'm likely glossing over details I don't quite understand since I didn't
> code it myself, and most of this design was done 2 years ago. This is how I
> believe it's implemented right now.
>
> 2) The output of each SRL is connected to the next one's input. The first
> (N/16) of the SRL's addresses would be set to 15 (max delay), the "middle"
> one would have mod(N,16), and the others would all be set to zero. The
> block's output is connected to the output of the last SRL. This I think
> would give a consistent delay of about N+(# of SRL's). The problem is that
> it would enforce a minimum delay that I would likely have to insert into my
> "prompt" channel to balance things back out.
>
> 3) Other ideas?? For instance, I actually would prefer to be able to create
> tens of "fingers" of delay without needing separate parallel pipelines but
> maybe by having them cascade into each other. Any app notes out there that I
> haven't dug up yet? I'm revisiting this since we're restarting the program
> and have the opportunity to revamp parts of the design.
>
> Marty Ryba
> semi-mad scientist
> proud member of the Luxuriant Flowing Hair Club for Scientists (no kidding!)
>
> "Peter Alfke" <a...@sbcglobal.net> wrote in message
>
> news:1174881568.648560.274310@n59g2000hsh.googlegroups.com...
>
> > Unclocked delay lines are really not stable over temperature and
> > voltage, although there are "servo" tricks to stabilize them (as done
> > in the IDELAY and ODELAY Virtex I/O functions)
> > For a very long clocked delay line, it might make sense to use a dual-
> > ported BlockRAM.
> > "Waste is often only in the eyes of the beholder..."
> > Peter Alfke, Xilinx
> > On Mar 25, 8:35 pm, John_H <newsgr...@johnhandwork.com> wrote:
> >> Marty Ryba wrote:
> >> > Slightly in another direction...is there a trick to setting up the
> >> > cascades on the SRL16s to maintain a consistent delay? We strung 8 in
> >> > a row to get an adjustable 1-bit delay line. It works, but there's a
> >> > bunch of extra muxes, etc. to get the delay consistent (3 clocks plus
> >> > whatever tap I pull as output). I'm actually the systems guy and not
> >> > the VHDL coder (and communication of requirements is always tricky
> >> > when you're doing something new), but I'm always interested in the
> >> > mechanics to see if it can be done better (smaller) while still
> >> > meeting requirements. Especially since I'd really like a 1024 tap
> >> > delay but I ran out of space (I need tens of these, plus other DSP
> >> > goodies). Suggestions on other mechanisms to use are also welcome.
> >> Shift registers are clocked.  Clocked elements don't have routing
> >> consistency issues, they have routing maximum issues.  I'd suggest using
> >> some Xilinx routing for combinatorial delays in an *extremely* well
> >> controlled situation, inverting consecutive stages of a multi-tap delay
> >> to reduce pulse width distortion.  But a 1024 element delay line?!  It
> >> sounds like you need a nice, clocked delay.  SRLs in series shouldn't
> >> have delay issues.
>
> >> Is it that you're taking the output from a very long clocked shift
> >> register?  If so, just clock the muxed outputs to get all the SRLs to
> >> show up at the output pin at a predictable time.
>
> >> Often the conceptual problem with unclocked delay lines is figuring out
> >> how to get a consistent input path or a consistent output path; the
> >> trouble is, both are needed.
>
> >> What is your desired range and resolution?  Acceptable jitter?

Article: 117229
Subject: Re: Solaris 10
From: navanee@gmail.com
Date: 26 Mar 2007 20:58:02 -0700

On Mar 23, 1:36 pm, Michael Laajanen <michael_laaja...@yahoo.com>
wrote:
> Hi,
>
> Have anyone heard anything on Xilinx on Solaris 10 and AMD?
>
> /michael

Not sure if the following "update" is already available for Solaris
10, but it seems very promising:

http://www.sun.com/software/solaris/ds/linux_interop.jsp#3

"...In addition, in an update to the Solaris 10 OS, the Solaris Linux
Application Environment will allow users on x86 systems to take
existing, unmodified Linux binaries and run them on the Solaris
platform. This new level of interoperability will give users access to
the applications they prefer while at the same time enabling them to
reap the benefits of Solaris 10 functionality...."

--navanee

Article: 117230
Subject: Post PAR simulation for RAM Block implementations
From: "veeresh" <veeresh8210@yahoo.co.in>
Date: 26 Mar 2007 21:54:44 -0700

Hello Sir,

       I'm trying to simulate a design involving a Block RAM implemented
using Core Generator.
Please consider the example "Dual Port Block RAM v6.1" at
 http://www.xilinx.com/support/software/coregen/71i_coregen_examples.htm

I generated an empty test bench for this design. I'm able to run a
post-PAR simulation on Virtex 2. When I tried the same on Virtex 4, the
ModelSim XE simulator gave an error message saying that two generics,
"en_ecc_read" and "en_ecc_write", are not defined.

My tool set is:
Tool : ISE 7.1
Simulator : Modelsim XE III/Starter 6.0a

Can you please help me to find the reason?

Thanks and Regards,
Veeresh

Article: 117232
Subject: FPGA board with multiple Ethernet connections (Gigabit Ethernet)
From: sheikh.m.farhan@gmail.com
Date: 26 Mar 2007 22:12:54 -0700

Hi,
I am searching for an FPGA board with multiple Gigabit Ethernet
ports. To be more precise, I need multiple RJ45 connectors and the
associated logic (PHYs) on the FPGA board so that I can process multiple
(maximum 24) Ethernet streams on the FPGA. Can anyone suggest such a
board, or an add-on module that can be hooked up to some particular FPGA
board to provide 24 Ethernet connections?

Best Regards
Farhan

Article: 117233
Subject: Where is MIG 1.7???
From: "=?utf-8?B?R2FMYUt0SWtVc+KEog==?=" <taileb.mehdi@gmail.com>
Date: 26 Mar 2007 23:11:26 -0700

MIG 1.7 is already cited in the new versions of DDR2 SDRAM related
application notes.
There are also a few answer records about it.
But there is no download link on Xilinx's site :((

Mehdi

Article: 117234
Subject: Open-source CPU-core for standard-cell ASIC?
From: "news.la.sbcglobal.net" <dontreply@nowhere.net>
Date: Tue, 27 Mar 2007 06:25:09 GMT

Forgive me if this topic has been beaten to death.
Are there any *production-quality* open-source embedded CPU cores,
that are suitable for a standard-cell (0.18u) ASIC implementation?

I see lots of CPU-projects on www.opencores.org, some with obviously
amateurish documentation/legal-disclaimers ("I copied company X's
CPU, so I don't know you can legally use my core in your project, enjoy!")

In my limited search (opencores.org and basic google search), I've only
found a handful of candidates:

32-bit:
OpenRISC 1000 (from opencores.org)
Leon2/3 SPARC (www.gaisler.com)

8-bit/16-bit:
many 805x clones
a Z80 clone on www.opencores.org
various PIC micro-controller clones (of questionable legality...)

From what I can tell, Leon2/3 is the most robust candidate (SPARC V8
certified), and since it implements a well-known ISA, commercial
devtools can target it.  (Is that right?)

OpenRISC 1000 is an original RISC ISA, with gcc/gdb port.  A
few press releases suggest it's been used in commercial ASICs.

What about the 8-bit and 16-bit cores?

Article: 117235
Subject: Re: Open-source CPU-core for standard-cell ASIC?
From: "John McGrath" <tails4e@gmail.com>
Date: 26 Mar 2007 23:33:56 -0700

On Mar 26, 11:25 pm, "news.la.sbcglobal.net" <dontre...@nowhere.net>
wrote:
> Forgive me if this topic has been beaten to death.
> Are there any *production-quality* open-source embedded CPU cores,
> that are suitable for a standard-cell (0.18u) ASIC implementation?
>
> I see lots of CPU-projects on www.opencores.org, some with obviously
> amateurish documentation/legal-disclaimers ("I copied company X's
> CPU, so I don't know you can legally use my core in your project, enjoy!")
>
> In my limited search (opencores.org and basic google search), I've only
> found a handful of candidates:
>
> 32-bit:
> OpenRISC 1000 (from opencores.org)
> Leon2/3 SPARC (www.gaisler.com)
>
> 8-bit/16-bit:
> many 805x clones
> a Z80 clone on www.opencores.org
> various PIC micro-controller clones (of questionable legality...)
>
> From what I can tell, Leon2/3 is the most robust candidate (SPARC V8
> certified), and since it implements a well-known ISA, commercial
> devtools can target it.  (Is that right?)
>
> OpenRISC 1000 is an original RISC ISA, with gcc/gdb port.  A
> few press releases suggest it's been used in commercial ASICs.
>
> What about the 8-bit and 16-bit cores?

Sun open-sourced the SPARC T1, with full Verilog source, compiler,
simulation files, the works. I believe the processor is quite advanced,
with multiple cores and multi-threading all as options. I believe the
latest version is more FPGA friendly, with lots of configurable
options. This is all from memory, but there is more info here:

http://www.opensparc.net/

Have Fun!

Article: 117236
Subject: Re: Where is Open Source for FPGA development?
From: "Daniel S." <digitalmastrmind_no_spam@hotmail.com>
Date: Tue, 27 Mar 2007 03:06:55 -0400

Eric Smith wrote:
> Daniel S. wrote:
>> The ISE Navigator is a simple text editor with a few tree-views and
>> miscellaneous eye-candy but it still manages to crash a few times per
>> week.
> 
> I haven't had a crash in Navigator in perhaps 1000 hours of using various
> releases.  I have had other ISE tools abort with internal errors.

The bulk of these crashes happened when I (mostly accidentally) clicked ERROR
or WARNING links that cause PN to open a browser tab... and when these tabs
do not crash PN, I usually get no answer record for the error or warning in
question.

>> The Schematic editor is rather plain as well yet it crashes
>> quite readily so I completely gave up on schematics after ISE 7.1.
> 
> I've never used the Schematic Editor, so I can't comment on its
> quality (or lack thereof).  It's clear that HDL support is a higher
> priority for Xilinx than schematic entry.

Back then, I used schematics only to avoid having to code boring top-level 
VHDL port maps... but the frequent crashes and other annoyances (like zero 
portability) quickly convinced me that I would be better off wrapping 
everything up in VHDL.

>> XST appears to have some severe syntax error intolerance and readily
>> crashes instead of reporting the said syntax or malformed construct
>> errors, leaving the user oblivious to the actual cause.
> 
> I used to see that in 7.x, but haven't seen it much in 8.x and 9.1.
> Maybe 7.x trained me not to do things that it didn't like.
> 
> Although it is obviously desirable for a tool to provide a good report
> of a syntax error rather than crashing, I'm much more concerned about
> ensuring that the tool does not crash for valid input.

An extra semicolon in the wrong place can crash XST, but these are often
hard to spot since any other programming language would silently accept
them. Over the last two or three years, I wasted over a month hunting down
syntax-induced crashes, about a week of it last summer because XST got
confused while processing BRAM inferences... it took me a few days to
figure out the link between crashes and memory inferences, and many more
tweaking the inference until I got what I wanted without crashing XST or MAP.

The last time I ran into simple syntax errors I was unable to spot within 
2-3 minutes, I simply ran the thing through ModelSim's compiler... both a 
fair bit faster and much more reliable.

>> My main complaint about PAR is extremely inconsistent runtimes for a
>> given design... anywhere from 5 to 30+ minutes for one of my smallest
>> projects.
> 
> I haven't experienced that much variation; I've only seen about a 2:1
> range.  However, I'm not convinced that expecting consistent timing
> is realistic; a small change to your input may cause a significant
> difference in the routing difficulty, especially if you have tight
> timing constraints.

I have two designs, each using ~15% of the same 2VP30; one has ~100ps slack
on a 5ns timespec while the other has ~1ns slack on a 10ns timespec. In both
cases I get similarly volatile 5-30 minute PAR runtimes even if I simply
"Rerun All" - the only thing that changes between runs in this case is the
random initialization values within the PAR algorithms.

Thankfully, computers get faster and the C2D-E6850 is only a few months 
away... that would be a substantial improvement over my current P4-3G.

Article: 117237
Subject: Re: Where is MIG 1.7???
From: "Helmut" <helmut.leonhardt@gmail.com>
Date: 27 Mar 2007 00:31:26 -0700

I think it should come with an IP update. But I'm excited for this
update to come.

Article: 117238
Subject: Re: RISC implementation questions
From: Andreas Hofmann <ahnews@gmx.net>
Date: Tue, 27 Mar 2007 09:52:23 +0200

Patrick schrieb:
> 2) How is it working with a NOP instruction? Does there the alu
> "execute" for example a R0 = R0 + R0. As R0 is always zero this doesnt
> have any effect. Or is there somehow an additional signal from the
> decode stage that tells the alu to do nothing?

That depends on your architecture. If you use flags like ZERO or CARRY
that are set on every ALU operation, coding NOP as "ADD r0, r0, r0"
might not be a good idea.

Otherwise, as your register r0 is read-only, you can do this and get
your NOP for free in terms of required opcodes. Likewise, you can emulate
register moves with "ADD r_dest, r_source, r0".

If you can encode your NOP instruction as "0...0", life will be easier, as
internal FPGA memory cells are typically set to 0 on configuration.
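
The effect of flags on this NOP encoding can be illustrated with a toy
model. This is a sketch of a hypothetical ISA, not any particular core: the
`execute` function, the tuple instruction encoding, the 32-bit width, and the
`update_flags` switch are all assumptions made for the example.

```python
def execute(regs, flags, instr, update_flags):
    """Execute one three-operand ADD; r0 always reads as zero."""
    op, rd, ra, rb = instr
    if op != "ADD":
        raise ValueError("only ADD is modelled")
    a = 0 if ra == 0 else regs[ra]
    b = 0 if rb == 0 else regs[rb]
    total = a + b
    if rd != 0:                      # writes to r0 are discarded
        regs[rd] = total & 0xFFFFFFFF
    if update_flags:                 # ISAs that set flags on every ALU op
        flags["ZERO"] = (total & 0xFFFFFFFF) == 0
        flags["CARRY"] = total > 0xFFFFFFFF


NOP = ("ADD", 0, 0, 0)

regs = [0, 7, 9, 0]
flags = {"ZERO": False, "CARRY": True}

execute(regs, flags, NOP, update_flags=False)
print(regs, flags)                   # state is untouched: a true NOP

execute(regs, flags, ("ADD", 3, 2, 0), update_flags=False)
print(regs)                          # r3 now holds a copy of r2

execute(regs, flags, NOP, update_flags=True)
print(flags)                         # earlier flag values are overwritten
```

In the flag-updating variant the "NOP" silently overwrites ZERO and CARRY,
which is the hazard described above; without flag updates the same encoding
leaves all visible state alone.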

Best regards
Andreas

Article: 117239
Subject: Re: Open-source CPU-core for standard-cell ASIC?
From: Jim Granville <no.spam@designtools.maps.co.nz>
Date: Tue, 27 Mar 2007 19:57:42 +1200

news.la.sbcglobal.net wrote:

> Forgive me if this topic has been beaten to death.
> Are there any *production-quality* open-source embedded CPU cores,
> that are suitable for a standard-cell (0.18u) ASIC implementation?
> 
> I see lots of CPU-projects on www.opencores.org, some with obviously
> amateurish documentation/legal-disclaimers ("I copied company X's
> CPU, so I don't know you can legally use my core in your project, enjoy!")
> 
> In my limited search (opencores.org and basic google search), I've only
> found a handful of candidates:
> 
> 32-bit:
> OpenRISC 1000 (from opencores.org)
> Leon2/3 SPARC (www.gaisler.com)
> 
> 8-bit/16-bit:
> many 805x clones
> a Z80 clone on www.opencores.org
> various PIC micro-controller clones (of questionable legality...)
> 
> From what I can tell, Leon2/3 is the most robust candidate (SPARC V8
> certified), and since it implements a well-known ISA, commercial
> devtools can target it.  (Is that right?)
> 
> OpenRISC 1000 is an original RISC ISA, with gcc/gdb port.  A
> few press releases suggest it's been used in commercial ASICs.
> 
> What about the 8-bit and 16-bit cores? 

You don't say what you need, but did you look at the Mico32 from Lattice?

That is opensource, and proven on their silicon, and others have 
compiled it onto X and A.

Maybe someone can give numbers for Mico32 on the Cyclone III  ?

-jg

Article: 117240
Subject: Re: Open-source CPU-core for standard-cell ASIC?
From: Jim Granville <no.spam@designtools.maps.co.nz>
Date: Tue, 27 Mar 2007 20:21:57 +1200

Jim Granville wrote:

> news.la.sbcglobal.net wrote:
>>
>> What about the 8-bit and 16-bit cores? 

I should also mention pacoblaze
http://bleyer.org/pacoblaze/

and also the Mico8 from Lattice

and Eric5 (not open source, but is small and supported)
http://www.entner-electronics.com/index_eng.html


-jg

Article: 117241
Subject: Re: how to read a sequence of video
From: "kha_vhdl" <abaidik@gmail.com>
Date: 27 Mar 2007 02:14:39 -0700

On 27 mar, 03:38, Eric Smith <e...@brouhaha.com> wrote:
> "kha_vhdl" <abai...@gmail.com> writes:
> > please i want some informations about reading videos using VHDL : how
> > to input videos then to treat it?
>
> It's rumored that the VHDL 2009 standard will include a standard package
> for video processing, so you're probably best off waiting for that.

Hi all,
First of all, I'm sorry for my stupid question, very sorry, but if
I'm being stupid, please try to point me in the right direction.
I am trying to implement a video coder. It contains many modules, and I
want to know how I can read my video in the first place.
For information, so far I don't have any details about my video
(my teacher didn't tell me these details), and as a beginner I
want to know the different ways to read it; so far I know of two
ways of inputting it, a manual one and an automatic one.
Really, that is all the information I have.

Article: 117242
Subject: CycloneII altlvds_rx
From: "Dolphin" <Karel.Deprez@gemidis.be>
Date: 27 Mar 2007 04:59:44 -0700

Hello,

I am using Altera's altlvds_rx core to make a video deserializer
in an FPGA. I have implemented the design and it is now running on my
board. However, it looks like I have a timing problem. Recently I
noticed that specifying a tSU and tHO for the LVDS pairs changes the
behavior. Has anybody used the altlvds core, and can you tell me whether
it is necessary to add constraints for this core:
- tSU and tHO? To which clock should these values be referenced?
- Fast input registers?

thanks and best regards,
Karel

Article: 117243
Subject: Re: Open-source CPU-core for standard-cell ASIC?
From: Colin Paul Gloster <Colin_Paul_Gloster@ACM.org>
Date: 27 Mar 2007 12:56:40 GMT
Links: << >> << T >> << A >>

Someone asked:
"[..]

I see lots of CPU-projects on www.opencores.org, some with obviously
amateurish documentation/legal-disclaimers ("I copied company X's
CPU, so I don't know you can legally use my core in your project, enjoy!")

[..]
Leon2/3 SPARC (www.gaisler.com)

[..]
a Z80 clone on www.opencores.org
[..] (of questionable legality...)"


You missed the Z80 clone for an Amstrad clone on WWW.Symbos.De/trex.htm


"From what I can tell, Leon2/3 is the most robust candidate (SPARC V8
certified),"

Have you ever read the many problems people talk about on one of its
Yahoo! email lists? Many are due to people not reading the
documentation, but not all.


" and since it implements a well-known ISA, commercial
devtools can target it.  (Is that right?)

[..]"

Very expensive commercial tools do legally target Leon processors.

Article: 117244
Subject: Re: help needed
From: "Paul" <pauljbennett@gmail.com>
Date: 27 Mar 2007 06:01:59 -0700
Links: << >> << T >> << A >>

"each state execute some vhdl code"

   Not all VHDL is synthesizable... in fact, about 70% of the language
is NOT synthesizable.   In general, "each state execute some vhdl code"
makes it sound like you think your FPGA will run like software - it
will not.  You are designing a piece of hardware that runs
CONCURRENTLY....  Things happening sequentially is a matter of HOW your
hardware operates - the synthesizer doesn't even know that part exists;
it is trying to build concurrent hardware per your code.  It is
possible to decode your state and use that as an enable to some piece
of hardware (in general, that's how to make things occur sequentially
in an FPGA), but it is impossible to make the state machine "trigger
some piece of VHDL"... that piece of VHDL, if synthesizable, defines a
piece of concurrent hardware that is ALWAYS there.  If your simulation
really operates like you described, chances are you've written in some
non-synthesizable constructs.
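
   The "decode your state and use that as an enable" idea can be
sketched roughly as below.  This is only an illustration, not the
original poster's design: the entity name state_enable_demo, the state
names, and the port list are made up, and the 5-bit counter width is
borrowed from the counter mentioned in the quoted question.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example: the counter hardware is ALWAYS there; the state
-- machine only decides WHEN it is enabled.
entity state_enable_demo is
  port (
    clk   : in  std_logic;
    rst   : in  std_logic;                     -- synchronous reset
    start : in  std_logic;
    count : out std_logic_vector(4 downto 0)
  );
end entity;

architecture rtl of state_enable_demo is
  type state_t is (IDLE, COUNTING, DONE);
  signal state  : state_t := IDLE;
  signal cnt    : unsigned(4 downto 0) := (others => '0');
  signal cnt_en : std_logic;
begin
  -- Decode the state into an enable (plain concurrent logic).
  cnt_en <= '1' when state = COUNTING else '0';

  -- State register and next-state logic.
  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        state <= IDLE;
      else
        case state is
          when IDLE =>
            if start = '1' then
              state <= COUNTING;
            end if;
          when COUNTING =>
            -- Leave once the counter has reached its maximum value;
            -- the counter wraps to 0 on this same clock edge.
            if cnt = 31 then
              state <= DONE;
            end if;
          when DONE =>
            state <= IDLE;
        end case;
      end if;
    end if;
  end process;

  -- The counter itself: a separate piece of hardware gated by cnt_en.
  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        cnt <= (others => '0');
      elsif cnt_en = '1' then
        cnt <= cnt + 1;
      end if;
    end if;
  end process;

  count <= std_logic_vector(cnt);
end architecture;
```

   Both processes describe hardware that exists all the time; the
sequencing comes only from which state asserts the enable.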

   Seriously, not trying to bust your balls or anything - just
speaking as a recent student myself.  Go take a basic digital design
course BEFORE you try to take that fancy FPGA Lab course.  If you
don't understand basic gates, registers, etc... you're never going to
properly understand how synthesis works.

   Good luck in your schooling

On Mar 25, 4:25 pm, djo...@btinternet.com wrote:
> Dear Sir
>
> I am using Quartus by ALtera, i have made my design using VHDL and
> have ran it on model sim, the basic design use a state machine that
> depends on a clock. each state execute some vhdl code. by when i try
> synathise it i get warnings such as
>
> Warning: Can't achieve minimum setup and hold requirement CLK_H along
> 1922 path(s). See Report window for details.
>
> -17.200 ns brain:I0|WRITE:I0|counter[4] rammomb:I1|
> alt3pram:alt3pram_component|altdpram:altdpram_component2|q[10]~reg_wa4
> CLK_H CLK_H 0.000 ns 19.000 ns 1.800 ns
>
> please guide me on what i need to do, the design has a counter 5-bit
> which is basically the inputted to the RAM address input.
>
> Regards
>
> Dharmesh Joshi

Article: 117245
Subject: Re: A suggestion for a new input interface for functions in VHDL: XOR(a0, a1, ...)
From: "Andy" <jonesandy@comcast.net>
Date: 27 Mar 2007 06:06:19 -0700
Links: << >> << T >> << A >>

I think Synopsys has problems with not knowing the type of the
expression (A'range => ASel). Other vendors seem to be able to figure
it out, but I've never tracked down whether it is legal per LRM.

Weng, would you rather have to type all that garbage out, or just
write a function:

...
temp := '0';
for i in x'range loop
  temp := temp xor x(i);
end loop;
return temp;
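
A complete version of that loop might look like the sketch below.  The
package name reduce_pkg and the function name xor_reduce are
hypothetical (xor itself is a reserved word, so the helper cannot
literally be called XOR).

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package reduce_pkg is
  -- Hypothetical helper: XOR of all elements of x.
  function xor_reduce (x : std_logic_vector) return std_logic;
end package;

package body reduce_pkg is
  function xor_reduce (x : std_logic_vector) return std_logic is
    variable temp : std_logic := '0';
  begin
    -- Works for any index range and direction because it uses x'range.
    for i in x'range loop
      temp := temp xor x(i);
    end loop;
    return temp;
  end function;
end package body;
```

With "use work.reduce_pkg.all;" in scope, a call could then be written
with an aggregate argument, e.g.
y_xor(0) <= xor_reduce((x(1), x(2), x(3), x(5)));
assuming x is a std_logic_vector.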

Andy

On Mar 26, 8:54 pm, Jim Lewis <j...@synthworks.com> wrote:
> Good Evening Mr. Tianxiang,
>
> > Hi Lewis,
> > I have a suggestion on VHDL function interface.
>
> > Here is a point: ('-' is used to simplify the 'downto')
> > R(63-0) <= a0*b0(63-0) + a1*b1(63-0) + ... + an*bn(63-0);
>
> How do you propose to handle the case where someone
> wrote an expression to get the bit one to the left of
> the leftmost bit:
>
>     Y <= A(A'left - 1) ;
>
> Golden Rule:
> You can't change the language in a way that breaks code that is
> currently valid.
>
> Also note, in VHDL, "*" is multiply and "+" is add.  Is that
> what you mean, or are you using them as shorthand for "and" and "or"?
>
> > To finish the output data bus, I have to add function:
> > BitAndVector(a, b), then the above equation becomes:
>
> > R(63-0) <= BitAndVector(a0, b0) + BitAndVector(a1, b1) + ... +
> > BitAndVector(an, bn);
>
> > If we have a new interface like this:
> > BitAndVectorThenOR(a, b, ...);
>
> > The above function can be called like this:
> > R(63-0) <= BitAndVectorThenOR(a0, b0, a1, b1, a2, b2);
> > or
> > R(63-0) <= BitAndVectorThenOR(a0, b0, a1, b1, a2, b2, a3, b3, a4, b3);
>
> Is this the same AND-OR logic we talked about when you wrote
> your conference paper?  The overloaded "and" function I showed
> you then has been integrated into the language with Accellera 3.0 draft
> of VHDL.  For the time being, you can use the following package:
>
> library ieee ;
> use ieee.std_logic_1164.all ;
>
> package TempPkg is
>
>    ------------------------------------------------------------
>    function "and" (
>      l : std_logic_vector ;
>      r : std_logic
>    ) return std_logic_vector ;
>
>    ------------------------------------------------------------
>    function "and" (
>      l : std_logic ;
>      r : std_logic_vector
>    ) return std_logic_vector ;
>
> end TempPkg ;
>
> -- ==============================================================
> package body TempPkg is
>
>    ------------------------------------------------------------
>    function "and" (
>      l : std_logic_vector ;
>      r : std_logic
>    ) return std_logic_vector is
>      variable result : std_logic_vector(l'range) ;
>    begin
>      for i in  l'range loop
>        result(i) := l(i) and r ;
>      end loop ;
>      return result ;
>    end ; -- "and"
>
>    ------------------------------------------------------------
>    function "and" (
>      l : std_logic ;
>      r : std_logic_vector
>    ) return std_logic_vector is
>      variable result : std_logic_vector(r'range) ;
>    begin
>      for i in  r'range loop
>        result(i) := r(i) and l ;
>      end loop ;
>      return result ;
>    end ; -- "and"
> end TempPkg ;
>
> With this package, you can write your equations as:
>    R(63 downto 0) <= (a0 and b0) or (a1 and b1) or ... or
>                      (an and bn);
>
> Hey, I even saved you enough typing that you don't have to be
> overly concerned about having to use "downto".
>
> Alternately you can use the following:
>      signal Y, A, B : std_logic_vector(7 downto 0) ;
>
>      Y <=
>          (A and (A'range => ASel)) or
>          (B and (B'range => BSel)) ;
>
> Note I have portability concerns with the above code as at one point in
> time Synopsys did not support it.  If they still do not support it,
> and you use their tools, report it as a bug (if you need help
> convincing them it is a bug, drop me an email).
>
>
>
> > Another example:
> > Function XOR(a0, a1, ...);
>
> > The following calls are all valid:
>
> > XOR(a0, a1, a2);
>
> > XOR(a0, a1, a2, a3, a4, a5);
>
> > The compiler will check if a0, a1, ... are the same type of inputs.
> > Any number of input signals more than 1 are allowed and the function
> > will do the specified XOR operation and so on.
>
> > Here is an example of my code to generate a XOR equation from 32 input
> > signals:
> > y_xor(0) <= (x(1) xor  x(2) xor  x(3) xor  x(5) xor  x(8) xor  x(9)
> > xor  x(11) xor x(14) xor x(17) xor x(18) xor x(19) xor x(21) xor x(24)
> > xor x(25) xor x(27) xor x(30) xor x(32) xor x(36) xor x(38) xor x(39)
> > xor x(42) xor x(44) xor x(45) xor x(47) xor x(48) xor x(52) xor x(54)
> > xor x(55) xor x(58) xor x(60) xor x(61) xor x(63));
>
> > If there is a new function interface:
> > y_xor(0) <= XOR(x(1), x(2), x(3), x(5), x(8), x(9), x(11), x(14),
> > x(17), x(18), x(19), x(21), x(24), x(25), x(27), x(30), x(32), x(36),
> > x(38), x(39), x(42), x(44), x(45), x(47), x(48), x(52), x(54), x(55),
> > x(58), x(60), x(61), x(63));
>
> Write yourself a function that accepts std_logic_vector as an input and
> add an extra set of parentheses to the call:
>    y_xor(0) <= XOR(( x(1), x(2), x(3), x(5), x(8), x(9), x(11), x(14),
>                      x(17), x(18), x(19), x(21), x(24), x(25), x(27), x(30), x(32), x(36),
>                      x(38), x(39), x(42), x(44), x(45), x(47), x(48), x(52), x(54), x(55),
>                      x(58), x(60), x(61), x(63) ));
>
> Now the function sees one input argument that is an array aggregate :).
>
> Good Luck to You,
> Jim Lewis
> SynthWorks VHDL Training

Article: 117246
Subject: Re: CycloneII altlvds_rx
From: "Rob" <robnstef@frontiernet.net>
Date: 27 Mar 2007 06:09:26 -0700
Links: << >> << T >> << A >>

Karel,

Yes, I've used this MegaWizard core many times; I've even built my
own.  If the skew (board + cable) and jitter (transmitter) between the
clock and data lanes are tight, you typically don't have to do anything
with setup and hold times--the MegaWizard takes care of all the
important timing under the hood.  What is your 1x clock frequency and
deserialization factor?

What do you mean when you say "it looks like I have a timing
problem"?  Does the transmitter have the ability to send a test
pattern?  I've found the best way to test these types of interfaces is
to use a known pattern.  Debugging timing on this interface with an
unknown, changing input will cause you to pull your hair out.
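
As a rough sketch of what checking a known pattern could look like on
the FPGA side: the checker below assumes the transmitter can send a
simple incrementing count, which is purely an assumption for
illustration.  The entity name pattern_checker, its ports, and the
8-bit width are hypothetical and are not part of the altlvds_rx core.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical checker for an incrementing-count test pattern.
entity pattern_checker is
  generic (width : integer := 8);
  port (
    clk       : in  std_logic;   -- parallel-side (1x) clock
    rst       : in  std_logic;   -- synchronous reset
    rx_data   : in  std_logic_vector(width - 1 downto 0);
    error_cnt : out std_logic_vector(15 downto 0)
  );
end entity;

architecture rtl of pattern_checker is
  signal expected : unsigned(width - 1 downto 0) := (others => '0');
  signal primed   : std_logic := '0';  -- '1' once one word has been seen
  signal errors   : unsigned(15 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        primed <= '0';
        errors <= (others => '0');
      else
        -- The next word should be the current word plus one, so the
        -- checker resynchronizes by itself after a mismatch.
        expected <= unsigned(rx_data) + 1;
        primed   <= '1';
        if primed = '1' and unsigned(rx_data) /= expected then
          if errors /= 65535 then      -- saturate instead of wrapping
            errors <= errors + 1;
          end if;
        end if;
      end if;
    end if;
  end process;

  error_cnt <= std_logic_vector(errors);
end architecture;
```

An error count that stays at zero over a long run suggests the capture
timing is sound; a count that climbs with temperature or cabling
changes points back at skew or constraints.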

Rob

On Mar 27, 7:59 am, "Dolphin" <Karel.Dep...@gemidis.be> wrote:
> Hello,
>
> I am using Altera's altlvds_rx core to make a video deserializer
> in an FPGA. I have implemented the design and it is now running on my
> board. However, it looks like I have a timing problem. Recently I
> noticed that specifying a tSU and tHO for the LVDS pairs changes the
> behavior. Has anybody used the altlvds core, and can you tell me whether
> it is necessary to add constraints for this core:
> - tSU and tHO? To which clock should these values be referenced?
> - Fast input registers?
>
> thanks and best regards,
> Karel

Article: 117247
Subject: Re: Spartan 3E Not enough block ram.
From: "Paul" <pauljbennett@gmail.com>
Date: 27 Mar 2007 06:16:59 -0700
Links: << >> << T >> << A >>

As I think has been answered on here before, DRAM has a complex
sequence of commands that must be issued to it; there is no way to
"just know how to use it".  An interface will probably take on the
order of 500-1000 slices in your part.  If this is an eval board, I'd
be surprised if the material the board came with didn't include SOME
sort of DRAM interface in the examples... but I don't know.

Yes, the construct you pointed out is most likely what is inferring
the block RAMs.  To confirm that, look at the "Language Templates"
section of ISE.  It will show you exactly which constructs infer
different pieces of hardware.
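
For reference, a self-contained version of the quoted construct might
look like the sketch below.  The entity name dp_ram_sketch and the
addr_width generic are assumptions (the original uses a LOG2_BASE
function that is not shown), and whether a given tool maps this onto
block RAM should be checked in the synthesis report.  As a sanity
check on the numbers: 1920 entries x 8 bits = 15,360 bits, which
should fit in a single 18 Kbit Spartan-3E block RAM in its 2K x 9
configuration, so each instance would be expected to cost one block
RAM.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical stand-alone version of the quoted dp_bram fragment.
entity dp_ram_sketch is
  generic (
    data_width : integer := 8;
    addr_width : integer := 11   -- 2**11 = 2048 >= 1920 entries
  );
  port (
    wr_clk  : in  std_logic;
    rd_clk  : in  std_logic;
    ce      : in  std_logic;
    wr_en   : in  std_logic;
    rd_en   : in  std_logic;
    wr_addr : in  std_logic_vector(addr_width - 1 downto 0);
    rd_addr : in  std_logic_vector(addr_width - 1 downto 0);
    din     : in  std_logic_vector(data_width - 1 downto 0);
    dout    : out std_logic_vector(data_width - 1 downto 0)
  );
end entity;

architecture rtl of dp_ram_sketch is
  -- Sized to the full address range so every address is a valid index.
  type mem_t is array (0 to 2**addr_width - 1)
    of std_logic_vector(data_width - 1 downto 0);
  signal mem_array : mem_t;
begin
  -- Synchronous write port.
  process (wr_clk)
  begin
    if rising_edge(wr_clk) then
      if ce = '1' and wr_en = '1' then
        mem_array(to_integer(unsigned(wr_addr))) <= din;
      end if;
    end if;
  end process;

  -- Synchronous (registered) read port; the registered read is what
  -- lets the array map onto block RAM rather than LUT-based RAM.
  process (rd_clk)
  begin
    if rising_edge(rd_clk) then
      if ce = '1' and rd_en = '1' then
        dout <= mem_array(to_integer(unsigned(rd_addr)));
      end if;
    end if;
  end process;
end architecture;
```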

It sounds like the only way to do what you want is to use the external
RAM, which means finding or designing a DRAM interface.  You will
also, however, need to redesign the hardware AROUND the memory, since
it will not operate as fast or as smoothly as the block RAMs.  Most
likely, the DRAM interface will be buffered by a FIFO to the remainder
of the FPGA.  You will have worse latency and bandwidth compared to
block RAM.

Sorry for the bad news... but it doesn't sound like this is a cut and
paste job.  You're gonna have to teach yourself a little bit about
hardware and VHDL and FPGAs, and then you're gonna have to do some
actual design - go figure.

On Mar 26, 11:06 pm, "Ken Soon" <c...@xilinx.com> wrote:
> I trying to port a design (video scaler) from Virtex 4 to Spartan 3E.
> Currently having trouble with not enough block rams.
>
> My reference (top) design uses only 15 Block Rams, but after wrapping a
> "wrapper"  module around my top design. The block Rams shoots up to 60.
> Thus, I decided to go and look into this wrapper module. It instantiates
> alot of this dp_bram module. So, I went to look into this dp_bram module and
> found the following codes. ( I took out relevant portions of it for easier
> understanding)
>
> (Just for note: This wrapper module is the wrapper around the video scaler
> for trial synthesis.  It includes two instances of the scaler, as well as a
> simple bus interface for the control register inputs.)
>
> --Information in Entity portion--
> data_width           :  integer  := 8;
> mem_size             :  integer  := 1920
> wr_addr  :  in    std_logic_vector((LOG2_BASE(mem_size) - 1) downto 0);
> rd_addr  :  in    std_logic_vector((LOG2_BASE(mem_size) - 1) downto 0);
> din      :  in    std_logic_vector((data_width - 1) downto 0);
>
> --Architecture portion--
> process (wr_clk)
>    begin
>       if (wr_clk'event and wr_clk = '1') then
>          if (ce = '1') then
>             if (wr_en = '1') then
>                mem_array(conv_integer('0' & wr_addr)) <= din;
>             end if;
>          end if;
>       end if;
>    end process;
>
>    process (rd_clk)
>    begin
>       if (rd_clk'event and rd_clk = '1') then
>          if (ce = '1') then
>             if (rd_en = '1') then
>                dout <= mem_array(conv_integer('0' & rd_addr));
>             end if;
>          end if;
>       end if;
>    end process;
>
> Now I guess, this mem_array (mem_array(conv_integer('0' & wr_addr)) <= din;)
> is the main culprit using the block rams for data storage, right?
> Hmm, so how should I go about trying to solve this problem of having not
> enough block rams. Then, having some DDR SDRAM on my board, I have tried to
> learn about dram and dram controller but woah, it is a little too
> overwhelming to understand (little experience with fpga here). Could anyone
> please simplify the usage of dram? (or maybe it is really so not simple?
> hmm)
>
> Oh yah, I have also tried looking at Xilinx memory interface generator for
> DDR SDRAM controller. Trying to figure how to use it and later how to
> integrate into my design.
>
> (gosh why cant I just have the mem_array just automatically use the
> dram....)

Article: 117248
Subject: Re: how to read a sequence of video
From: "Guenter" <GHEDWHCVEAIS@spammotel.com>
Date: 27 Mar 2007 06:21:11 -0700
Links: << >> << T >> << A >>

On Mar 27, 11:14 am, "kha_vhdl" <abai...@gmail.com> wrote:
[...]
> Hi all,
> First of all, I'm sorry for my stupid question, very sorry, but if
> I'm being stupid, please try to show me the right way.

The right way is to go to your teacher and ask him. How do you expect
someone here to read the mind of your teacher, if you are not able to
do it?

Cheers,

Guenter

Article: 117249
Subject: PCI-Express drivers with Xilinx FPGA?
From: "eromlignod" <eromlignod@aol.com>
Date: 27 Mar 2007 06:36:06 -0700
Links: << >> << T >> << A >>

Hi guys:

First of all, forgive me for being a dumb mechanical engineer...FPGAs
are new to me.

I'm programming a Xilinx Spartan-3 FPGA development kit.  I'm doing
pretty well so far.  I've gotten my huge Verilog program written and
finally got it to simulate correctly and set up all the I/O pins,
etc.  Now all I have to do is download it to the chip.

My problem is that the board communicates via a PCI-Express slot.  I
shut down my computer, plug the Xilinx board into my PCI-Express slot,
then turn the computer back on.  Unfortunately, the computer then
insists that I provide it with some sort of software or drivers to
recognize the new hardware (which it identifies as a "coprocessor").

Well, as far as I can tell, they never gave me these drivers or
software (and I wouldn't know them if I found them anyway) and finding
any answers on their website has proven to be extremely difficult.
They have a forum, but it is not very active (only a post every day or
so).

Has anyone here ever dealt with this kit?  And, if so, do you know how
I can get hooked up?  I'm thinking it's probably something
ridiculously simple that I'm not grasping right now.  Thanks for any
help.

Don
