Nios® V/II Embedded Design Suite (EDS)
Support for Embedded Development Tools, Processors (SoCs and Nios® V/II processor), Embedded Development Suites (EDSs), Boot and Configuration, Operating Systems, C and C++

Maximum ethernet speed available with NiosII + Linux

Altera_Forum
Honored Contributor II

Hello, 

I plan to build a system with a Nios II (with MMU) and Linux on a Cyclone III device. 

I will use an Ethernet MAC IP core with a DP83848 PHY. 

I need TCP communication over Ethernet. 

What is the maximum data throughput I could hope to get? 

Which MAC IP core should I use?
Altera_Forum
Honored Contributor II

There are already several threads on this issue in this forum (perhaps some of them were lost in the big crash a year ago).  

 

AFAIR, the max throughput found was about 1-5 MByte/sec for a 100 MHz CPU without MMU.  

 

No idea if this will be much better or worse if you use an MMU.  

 

I doubt that the MAC or PHY hardware makes a big difference.  

 

-Michael
Altera_Forum
Honored Contributor II

Thanks for answering, 

 

I have read the other threads and found the information you mention. 

This information doesn't really square with the availability of Gigabit IP cores. 

What is the use of such an IP core? 

 

What is the real advantage of using an FPGA in place of a CPU (I currently use an Atom processor with Gigabit LAN) if the performance is lower at a higher price? An Atom processor plus an Intel 82574 is cheaper than a mid-range FPGA that supports Gigabit LAN. 

 

Is there something I have missed?
Altera_Forum
Honored Contributor II

In fact, I'm only considering the MMU so I can run full Linux instead of µClinux. 

I'm thinking of using something like a DMA controller to improve the data rate; that's why I was asking about the best MAC IP core.
Altera_Forum
Honored Contributor II

See my thread on this here: http://www.alteraforum.com/forum/showthread.php?t=18934 

There's a link or two to other useful threads there too. 

 

So far, Ethernet performance with a gigabit PHY (DP83865) can best be described as disappointing. I haven't been able to get it to work correctly at gigabit speed, so my results should be comparable to the DP83848. 

I am getting at best 38 Mbit/s UDP with 1 KB messages and 30 Mbit/s TCP with the following tweaks: 

- the altera_tse.c (SLS) driver for the Altera TSE MAC 

- 32 KB processor caches 

- a bunch of network-related kernel functions in SSRAM instead of DDR DRAM 

- no cache burst 

- MMU (similar results without it) 

I am also seeing some unexplained dropped packets. 

Note that at least one other person in the thread has been able to get hundreds of Mbit/s with a similar setup, though they weren't using Linux.
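
For reference, this kind of measurement usually boils down to a tight sender loop on the Nios side with something like iperf or netcat listening on the PC side. A minimal sketch, with the address, port and message size as placeholders rather than the exact test above:

/* Minimal TCP throughput sender: connect to a receiver and push
 * fixed-size messages as fast as possible. Cross-compile with the
 * Nios II Linux toolchain; time the run externally (e.g. with `time`)
 * and divide bytes sent by elapsed seconds. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define MSG_SIZE 1024           /* 1 KB messages, as in the figures above */
#define NUM_MSGS (10 * 1024)    /* ~10 MB total */

int main(void)
{
    char buf[MSG_SIZE];
    struct sockaddr_in dst;
    int sock, i;

    memset(buf, 0xAA, sizeof(buf));
    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons(5001);                     /* placeholder port */
    dst.sin_addr.s_addr = inet_addr("192.168.0.1"); /* placeholder host */

    sock = socket(AF_INET, SOCK_STREAM, 0);
    if (sock < 0 || connect(sock, (struct sockaddr *)&dst, sizeof(dst)) < 0) {
        perror("connect");
        return 1;
    }
    for (i = 0; i < NUM_MSGS; i++) {
        if (write(sock, buf, sizeof(buf)) < 0) {
            perror("write");
            break;
        }
    }
    close(sock);
    return 0;
}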
Altera_Forum
Honored Contributor II

An FPGA can provide superior Ethernet throughput to a processor. However, the Nios II is not a high-speed processor; there is essentially no way it can keep up with an Atom in terms of performance. If the sole purpose of your product is Ethernet and you need a processor, a stand-alone processor is a better fit than an FPGA. 

 

Jake
Altera_Forum
Honored Contributor II

 

--- Quote Start ---  

What is the real advantage of using an FPGA in place of a CPU (I currently use an Atom processor with Gigabit LAN) if the performance is lower at a higher price? An Atom processor plus an Intel 82574 is cheaper than a mid-range FPGA that supports Gigabit LAN. 

Is there something I have missed? 

--- Quote End ---  

 

 

If you use just standard parts and you find a processor with embedded peripherals that fits your needs, it's obvious that this combination can be cheaper and/or faster than an FPGA with an embedded processor.  

 

The FPGA gets interesting or even necessary if you need to handle non-standard interfaces that you don't find in a standard chip, or if your goal is a long product life and you don't want to depend on the availability of certain highly specialized chips.  

 

-Michael
Altera_Forum
Honored Contributor II

 

--- Quote Start ---  

I'm thinking of using something like a DMA controller to improve the data rate; that's why I was asking about the best MAC IP core. 

--- Quote End ---  

 

AFAIK, all decent Ethernet IP cores and the corresponding Linux drivers do use DMA. The performance is limited by the standard Linux IP stack, which copies the frames several times. There are "zero-copy" stacks for Linux and for other OSes used in routers, but AFAIK none has been ported to Nios yet. 
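
One partial workaround that does exist in the standard kernel is sendfile(): when the payload already sits behind a file descriptor, it goes into the socket without the extra trip through a user-space buffer. That is not a zero-copy stack, just one copy less. A minimal sketch, with the socket assumed already connected and the file path a placeholder:

/* Sketch: send the contents of a file over an already-connected TCP
 * socket with sendfile(), skipping the usual read()-into-buffer /
 * write()-from-buffer copy in user space. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <unistd.h>

int send_file_over_socket(int sock, const char *path)  /* path: placeholder */
{
    struct stat st;
    off_t offset = 0;
    int fd = open(path, O_RDONLY);

    if (fd < 0) {
        perror("open");
        return -1;
    }
    if (fstat(fd, &st) < 0) {
        perror("fstat");
        close(fd);
        return -1;
    }
    while (offset < st.st_size) {
        ssize_t n = sendfile(sock, fd, &offset, st.st_size - offset);
        if (n <= 0) {
            perror("sendfile");
            break;
        }
    }
    close(fd);
    return (offset == st.st_size) ? 0 : -1;
}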

 

-Michael
Altera_Forum
Honored Contributor II

Thanks for your reply. 

 

Of course I need to implement more than a processor and a MAC + PHY. I have specialized glue logic to implement, so the FPGA is needed. 

The question for me is to choose one of these: 

1 - Atom + chipset + Ethernet chip + PCI bridge + FPGA. In this case the FPGA contains only the specific logic (an FPGA like the EP1C3). 

2 - Atom + chipset + FPGA + PHY. In this case the FPGA contains the specific logic + PCI bridge + Ethernet MAC. 

3 - Only FPGA + PHY. 

 

I already use solution 1, so I know it works. But it takes a lot of space on my PCB; the components aren't very expensive, but there are many of them and the production cost is high. 

Solution 2 is difficult on the software side and needs an FPGA with transceivers. 

Solution 3 would be the best if: 

- the FPGA is not too expensive (less than $150) 

- network performance is enough: my maximum transfer rate is 4.9 MByte/s with TCP. 

 

Since ykozlov found a maximum rate of 30 Mbit/s with TCP (so 3.75 MByte/s), I think we are near the objective. 

 

What would be the impact of jumbo frames on the data rate? If the Nios processor spends time reading and creating the Ethernet/TCP headers, shouldn't jumbo frames change the data rate? 

Has somebody tried it?
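
For what it's worth, on the Linux side enabling jumbo frames is just a matter of raising the interface MTU; whether the MAC, the PHY and the driver actually accept frames that large is a separate question that would need checking. A minimal sketch, with "eth0" and 9000 as placeholder values:

/* Sketch: raise the MTU of eth0 to a jumbo-frame size from user space.
 * The driver is free to reject MTUs it cannot support. */
#include <net/if.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    struct ifreq ifr;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0) {
        perror("socket");
        return 1;
    }
    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, "eth0", IFNAMSIZ - 1);  /* placeholder interface */
    ifr.ifr_mtu = 9000;                           /* typical jumbo-frame MTU */

    if (ioctl(fd, SIOCSIFMTU, &ifr) < 0) {
        perror("SIOCSIFMTU");                     /* driver may refuse it */
        close(fd);
        return 1;
    }
    close(fd);
    return 0;
}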
Altera_Forum
Honored Contributor II

 

--- Quote Start ---  

The performance is limited by the standard Linux IP stack, which copies the frames several times. 

--- Quote End ---  

 

 

OK, so jumbo frames will change nothing. 

The work to be done is in the software/driver part. Am I correct?
Altera_Forum
Honored Contributor II

The iniche stack does provide a zero-copy interface. 

We DMA ADC data directly into a zero-copy buffer and DMA it straight out. We also use an HDL checksum component. The final improvement for us would be to compute the checksum as the data is DMA'd to the TSE. It would be nice if this were an option in the TSE controller when it's in store-and-forward mode (I'm not aware that it is). 

We get away with the processor having to do very little in the way of accessing the TCP packets.
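
Roughly, the transmit path looks like the sketch below. The three helper functions are hypothetical stand-ins, not the real NicheStack or HAL calls, so treat it as a picture of the data flow rather than working code:

/* Illustration only: the general shape of a zero-copy transmit path.
 * zc_alloc_buffer(), dma_capture_into() and zc_tcp_send() are
 * hypothetical names; in a real design they map onto the stack's
 * zero-copy buffer API and the ADC/TSE DMA drivers. */
#include <stddef.h>
#include <stdint.h>

#define ADC_BLOCK_BYTES 1024

/* Hypothetical: get a transmit buffer owned by the TCP/IP stack, so the
 * payload never has to be memcpy'd into a socket buffer later. */
extern uint8_t *zc_alloc_buffer(size_t len);

/* Hypothetical: start the ADC-side DMA so samples land directly in buf. */
extern void dma_capture_into(uint8_t *buf, size_t len);

/* Hypothetical: hand the same buffer back to the stack for transmission;
 * the TSE then DMAs it out and the HDL component inserts the checksum. */
extern int zc_tcp_send(int sock, uint8_t *buf, size_t len);

void stream_one_block(int sock)
{
    uint8_t *buf = zc_alloc_buffer(ADC_BLOCK_BYTES);
    if (buf == NULL)
        return;                               /* stack is out of buffers */

    dma_capture_into(buf, ADC_BLOCK_BYTES);   /* ADC -> buffer, no CPU copy */
    zc_tcp_send(sock, buf, ADC_BLOCK_BYTES);  /* buffer -> TSE, no CPU copy */
    /* The CPU only builds protocol headers; the payload itself is never
     * copied by software. */
}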
Altera_Forum
Honored Contributor II

 

--- Quote Start ---  

We DMA ADC data directly into a zero-copy buffer and DMA it straight out. We also use an HDL checksum component. 

--- Quote End ---  

 

 

In your case what is your best data rate?
Altera_Forum
Honored Contributor II

 

--- Quote Start ---  

OK, so jumbo frames will change nothing. 

The work to be done is in the software/driver part. Am I correct? 

--- Quote End ---  

 

Software, yes. Driver, not so much. A relatively small part of the time it takes to send a packet from an application is the time spent in driver code.
Altera_Forum
Honored Contributor II

 

--- Quote Start ---  

The iniche stack does provide a zero-copy interface. 

--- Quote End ---  

I don't suppose this works together with a Linux system (the theme of this forum).  

 

Of course, it should be possible to implement a dual-Nios system, one with NicheStack, one with Linux.  

 

-Michael
Altera_Forum
Honored Contributor II

I am also interested in knowing how the performance changes with TCP zero-copy enabled. I am currently seeing no difference, so I am wondering if I have something else set that is disabling it. 

 

jcnhal - Any chance you can post the difference in performance you are seeing?
Altera_Forum
Honored Contributor II

 

--- Quote Start ---  

with TCP zero-copy enabled. I am currently seeing no difference, 

--- Quote End ---  

 

How did you "enable zero copy" with Linux? 

I seriously doubt that this is easily possible. You need to (1) port a zero-copy IP stack to Nios and (2) create a zero-copy-enabled Ethernet driver for your hardware. 

 

-Michael
Altera_Forum
Honored Contributor II

Sorry Michael, I was doing this in Windows with the 9.1 software... 

I mixed up two threads of conversation, which caused confusion; I should have been clearer. I am simply interested in finding out the maximum TCP performance anyone has been able to achieve.