AVR32705: AVR32AP7 Networking Performance

Features

• Linux TCP and UDP performance measurements
  - ATSTK1000 (32-bit SDRAM bus width)
  - ATNGW100 (16-bit SDRAM bus width)
• Improved RX buffer management in the Linux MACB driver

32-bit Microcontrollers

Application Note

1 Introduction

This application note documents the TCP/IP and UDP/IP performance of the ATSTK1000 and ATNGW100 development boards running the 2.6.23-rc7 version of the Linux kernel. It also describes three optimizations of the MACB driver’s receive path and documents how they affect performance. The result of this optimization work is available in Atmel’s Linux kernel through Atmel’s Linux support web pages.

Prior knowledge of Linux is preferred but not required to understand the optimization techniques. Development tools (ATSTK1000 and ATNGW100) for the AVR32AP7 microcontrollers are available from Atmel.

Rev. 32066A-AVR32-02/08

2 Benchmarking tools

2.1 Iperf: The TCP/UDP Bandwidth Measurement Tool

Iperf is a TCP and UDP bandwidth measurement tool [1]. It can measure the maximum TCP bandwidth as well as UDP data loss at a given bandwidth, making it well suited for measuring the impact of optimizations across the whole TCP/IP stack. In this application note, it is used to measure the performance of the low-level MACB Ethernet driver in particular.

The most basic use of iperf involves running one instance in server mode and one instance in client mode, sending packets from the client to the server. Thus, to measure how the target’s TCP/IP stack performs on the receive (RX) side, first start iperf in server mode on the target:

    iperf -s

Then, run iperf in client mode on the development host:

    iperf -c 192.168.1.2

Replace “192.168.1.2” with the target’s actual IP address.

By default, the iperf client generates packets as fast as possible for 10 seconds and reports the average bandwidth. For details about how to customize this, see the iperf documentation.

To measure the target’s transmit (TX) performance, reverse the roles: start iperf in server mode on the development host, then start it in client mode on the target with the host’s IP address as a parameter.

3 Linux MACB driver improvements

The transmit and receive paths of the MACB Ethernet driver have stayed mostly the same since the initial version distributed with the STK1000 BSP 1.0. This section describes three optimizations that improve the performance of the receive path:

1. Store the first fragment of each packet into RAM with a two-byte offset.

2. Use non-coherent memory for the RX buffers.

3. Pass the RX buffers up the stack without copying them into a linear buffer first.

3.1 Linux socket buffers (skbuff)

The Linux kernel stores all information related to a network packet in a structure called “sk_buff” (socket buffer), or SKB for short [2]. The raw packet data can be stored in a linear data area in the SKB itself, but the SKB also contains a list of “fragments” that can carry additional data. Each fragment is passed as a physical page reference along with offset and size information indicating which part of the page contains the data. The fragment structure is declared in include/linux/skbuff.h and looks like this:

    struct skb_frag_struct {
            struct page *page;      /* physical page holding the data */
            __u32 page_offset;      /* offset of the data within the page */
            __u32 size;             /* length of the data in bytes */
    };

As the Linux networking stack processes an incoming packet, data is copied from the fragments into the linear data area as needed. The code that parses the Ethernet header, however, does not do this, so the driver must copy at least the 14 bytes of the Ethernet header into the linear data area.
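The header rule above can be sketched in user space. This is only a model under stated assumptions: `struct model_frag`, `pull_eth_header()` and `ETH_HEADER_LEN` are hypothetical names introduced for illustration, and a plain byte pointer stands in for the kernel's page reference. It is not the MACB driver's code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ETH_HEADER_LEN 14u  /* Ethernet header length in bytes */

/* Hypothetical user-space stand-in for an SKB fragment: a byte pointer
 * replaces the kernel's struct page reference. */
struct model_frag {
    const unsigned char *page;  /* start of the backing buffer */
    size_t page_offset;         /* where the data starts in the buffer */
    size_t size;                /* number of data bytes */
};

/*
 * Copy the Ethernet header from the fragment into the linear area and
 * shrink the fragment so that it no longer covers the header.
 * Returns 0 on success, -1 if either buffer is shorter than the header.
 */
static int pull_eth_header(struct model_frag *frag,
                           unsigned char *linear, size_t linear_len)
{
    if (frag->size < ETH_HEADER_LEN || linear_len < ETH_HEADER_LEN)
        return -1;

    memcpy(linear, frag->page + frag->page_offset, ETH_HEADER_LEN);
    frag->page_offset += ETH_HEADER_LEN;
    frag->size -= ETH_HEADER_LEN;
    return 0;
}
```

After a successful call, the linear area holds the 14 header bytes and the fragment describes only the data that follows them.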

3.2 Optimization 1: Offset the first RX buffer by two bytes

An Ethernet header is 14 bytes long. This is not a multiple of 4, so if the packet data is initially word-aligned, whatever comes after the Ethernet header will not be, resulting in poor performance. The existing MACB driver solves this by adding two bytes of padding to the SKB data area, making the Ethernet payload properly word-aligned. However, since the DMA buffer does not have the same padding, the source and destination buffers are aligned differently when the data is copied, which slows down the copy operation itself.

The MACB controller includes an RBOFF field in the NCFG register which specifies the amount of padding to be added before a packet is stored in RAM. For packets that span multiple buffers, this padding is only added to the first buffer. By specifying 2 bytes of padding in this field and adjusting the RX copying code accordingly, a minor performance increase is expected due to faster memcpy performance.

Even though this optimization may not increase performance much by itself, it is a prerequisite for the last optimization, where the buffers are passed up the stack as-is.
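The alignment argument is plain modular arithmetic, which the following sketch works through. It assumes a 4-byte word and a word-aligned buffer start. The helper names and `ETH_HEADER_LEN` are hypothetical and are not taken from the driver.

```c
#include <assert.h>

#define ETH_HEADER_LEN 14u  /* Ethernet header length in bytes */

/*
 * Misalignment (0..3) of the Ethernet payload relative to a 4-byte word
 * boundary, given the number of padding bytes that precede the frame in
 * a word-aligned buffer.
 */
static unsigned int payload_misalignment(unsigned int rx_offset)
{
    return (rx_offset + ETH_HEADER_LEN) % 4u;
}

/*
 * Nonzero if two buffer offsets have the same alignment modulo 4, which
 * is the case where a copy routine can move whole words.
 */
static int same_word_alignment(unsigned int src_offset,
                               unsigned int dst_offset)
{
    return (src_offset % 4u) == (dst_offset % 4u);
}
```

With no padding the payload starts 2 bytes past a word boundary, and with 2 bytes of padding it starts on one. A DMA buffer at offset 0 and an SKB data area at offset 2 differ in alignment, while both at offset 2 match, which is the situation the RBOFF setting creates.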

3.3 Optimization 2: Use non-coherent memory as RX buffers

The MACB driver currently uses coherent, i.e. non-cacheable, memory to store the RX data. This simplifies the code somewhat because such buffers need no cache synchronization, but it may also give suboptimal performance. By mapping the RX DMA buffers in non-coherent, cacheable memory, the CPU will be able to do burst accesses to transfer data from memory into the data cache.

The second patch replaces the coherent DMA buffers with regular pages obtained from the page allocator. Since each MACB receive buffer is 128 bytes and the default page size is 4096 bytes, each of these pages holds 32 RX buffers.

Using single pages as receive buffers is also a prerequisite for the third optimization, since the SKB fragment list consists of a list of single pages.
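The buffer-to-page arithmetic can be sketched as follows. The sizes come from the text, but the constant names, `struct rx_buffer_location` and `locate_rx_buffer()` are hypothetical. The sketch also assumes that buffers are packed back to back within each page, which the text does not state explicitly.

```c
#include <assert.h>
#include <stddef.h>

#define RX_BUFFER_SIZE   128u   /* MACB receive buffer size in bytes */
#define PAGE_SIZE_BYTES  4096u  /* default page size in bytes */
#define BUFFERS_PER_PAGE (PAGE_SIZE_BYTES / RX_BUFFER_SIZE)

/* Position of one receive buffer within a set of single pages. */
struct rx_buffer_location {
    size_t page_index;   /* which page holds the buffer */
    size_t page_offset;  /* byte offset of the buffer within that page */
};

/* Map a linear receive buffer index to its page and offset. */
static struct rx_buffer_location locate_rx_buffer(size_t buffer_index)
{
    struct rx_buffer_location loc;

    loc.page_index = buffer_index / BUFFERS_PER_PAGE;
    loc.page_offset = (buffer_index % BUFFERS_PER_PAGE) * RX_BUFFER_SIZE;
    return loc;
}
```

Buffers 0 to 31 fall in the first page, and buffer 32 starts the second page at offset 0.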

3.4 Optimization 3: Avoid copying of fragments into the linear data area

When one or more packets have been received, the current incarnation of the MACB driver scans the receive DMA descriptors and copies all the 128-byte DMA buffers that make up a packet into the linear data area of the SKB. The last optimization aims to avoid this copying by passing one or more references to the DMA buffers themselves up the stack.

Since Optimization 2 already replaced the single contiguous, coherent DMA buffer with multiple single pages, this becomes relatively straightforward. An SKB is allocated with just enough room for the Ethernet header plus the alignment padding. Then, each page that covers one or more of the receive buffers is added to the fragment list, after having its reference count incremented using get_page(). Finally, the Ethernet header is copied into the linear data area and the page offset and size of the first fragment are adjusted so that it no longer covers the Ethernet header.

When the SKB is no longer needed, kfree_skb() is called, freeing the linear data buffer and decrementing the reference count of all the pages in the fragment list using put_page(). Eventually, when a page has been released from all the fragment lists it was on and is no longer used as a DMA buffer, the reference count drops to zero and the page is freed.
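The page lifetime described above can be modelled in user space with a simple counter. This is a sketch only: `struct model_page`, `model_get_page()` and `model_put_page()` are hypothetical stand-ins that mimic the roles of the kernel's get_page() and put_page(), not the kernel functions themselves.

```c
#include <assert.h>

/* Hypothetical user-space model of a reference-counted page. */
struct model_page {
    int refcount;  /* number of current users of the page */
    int freed;     /* set once the last reference has been dropped */
};

/* Take one more reference, e.g. when a page joins a fragment list. */
static void model_get_page(struct model_page *page)
{
    assert(!page->freed);
    page->refcount++;
}

/* Drop one reference and mark the page freed when none remain. */
static void model_put_page(struct model_page *page)
{
    assert(page->refcount > 0);
    if (--page->refcount == 0)
        page->freed = 1;
}
```

In this model, a page that starts with one reference held for its use as a DMA buffer and then appears in two fragment lists reaches a count of three. It is marked freed only after both SKBs are released and the DMA reference is also dropped.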

4 Results

The iperf measurement results are shown in Table 3-1. The measurements were obtained as follows.

• TCP RX bandwidth: “iperf -s” on the target and “iperf -c 192.168.1.231” on the host.
• TCP TX bandwidth: “iperf -s” on the host and “iperf -c 192.168.1.21” on the target.
• UDP RX datagram loss: “iperf -s -u” on the target and “iperf -c 192.168.1.231 -u -b 60000000” on the host.
• UDP TX datagram loss: “iperf -s -u” on the host and “iperf -c 192.168.1.21 -u -b 60000000” on the target.

Table 3-1. Iperf measurement results

Board: ATSTK1000
Kernel: 2.6.23-rc7, + RX offset, + noncoherent RX, + avoid RX copy

Board: ATNGW100
Kernel: 2.6.23-rc6, + RX offset, + noncoherent RX, + avoid RX copy

TCP Bandwidth (Mbit/s):
54.6, 54.5, 58.8, 66.1, 43.6, 47.6, 50.7
69.1, 71.3, 70.3, 65.5, 56.3, 55.3, 55.7, 50.9

UDP Datagram Loss @ 60 Mbps:
14%, 17%, 0.039%, 44%, 50%, 46%, 19%

In the end, the three optimizations combined result in a 21% bandwidth increase on the STK1000 and a 16.3% bandwidth increase on the NGW100.
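The quoted percentages can be checked with a one-line calculation. The sketch below assumes that 54.6 and 66.1 Mbit/s are the ATSTK1000 baseline and final TCP figures, and that 43.6 and 50.7 Mbit/s are the corresponding ATNGW100 figures. That pairing is an assumption about how the table's values line up, and `percent_increase()` is a hypothetical helper.

```c
#include <assert.h>

/* Relative increase from `before` to `after`, in percent. */
static double percent_increase(double before, double after)
{
    assert(before > 0.0);
    return (after - before) / before * 100.0;
}
```

Under that assumption, `percent_increase(54.6, 66.1)` gives roughly 21.1 and `percent_increase(43.6, 50.7)` gives roughly 16.3, which agrees with the figures stated above.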

5 Further improvements

Some of the results were a bit surprising. Since the NGW100 has lower memory bandwidth than the STK1000 (16- vs 32-bit SDRAM data bus), it is very strange that it apparently benefited less from avoiding the copy operation. Also, the TCP TX bandwidth became lower on both the STK1000 and the NGW100 when the “avoid RX copy” patch was applied.

Unless this is pure noise, it seems to indicate that the new driver handles certain corner cases less than optimally, and there might be more room for improvement in the driver.

6 References

Information about the AT32AP7 is available at http://www.atmel.com/avr32/

1. Iperf website: http://dast.nlanr.net/Projects/Iperf/

2. How SKBs work: http://vger.kernel.org/~davem/skb.html

Headquarters

Atmel Corporation

2325 Orchard Parkway

San Jose, CA 95131

USA

Tel: 1(408) 441-0311

Fax: 1(408) 487-2600

International

Atmel Asia

Room 1219

Chinachem Golden Plaza

77 Mody Road Tsimshatsui

East Kowloon

Hong Kong

Tel: (852) 2721-9778

Fax: (852) 2722-1369

Atmel Europe

Le Krebs

8, Rue Jean-Pierre Timbaud

BP 309

78054 Saint-Quentin-en-Yvelines Cedex

France

Tel: (33) 1-30-60-70-00

Fax: (33) 1-30-60-71-11

Atmel Japan

9F, Tonetsu Shinkawa Bldg.

1-24-8 Shinkawa

Chuo-ku, Tokyo 104-0033

Japan

Tel: (81) 3-3523-3551

Fax: (81) 3-3523-7581

Product Contact

Web Site

www.atmel.com

Technical Support

avr32@atmel.com

Sales Contact

www.atmel.com/contacts

Literature Request

www.atmel.com/literature

Disclaimer: The information in this document is provided in connection with Atmel products. No license, express or implied, by estoppel or otherwise, to any intellectual property right is granted by this document or in connection with the sale of Atmel products. EXCEPT AS SET FORTH IN ATMEL’S TERMS AND CONDITIONS OF SALE LOCATED ON ATMEL’S WEB SITE, ATMEL ASSUMES NO LIABILITY WHATSOEVER AND DISCLAIMS ANY EXPRESS, IMPLIED OR STATUTORY WARRANTY RELATING TO ITS PRODUCTS INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT. IN NO EVENT SHALL ATMEL BE LIABLE FOR ANY DIRECT, INDIRECT, CONSEQUENTIAL, PUNITIVE, SPECIAL OR INCIDENTAL DAMAGES (INCLUDING, WITHOUT LIMITATION, DAMAGES FOR LOSS OF PROFITS, BUSINESS INTERRUPTION, OR LOSS OF INFORMATION) ARISING OUT OF THE USE OR INABILITY TO USE THIS DOCUMENT, EVEN IF ATMEL HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. Atmel makes no representations or warranties with respect to the accuracy or completeness of the contents of this document and reserves the right to make changes to specifications and product descriptions at any time without notice. Atmel does not make any commitment to update the information contained herein. Unless specifically provided otherwise, Atmel products are not suitable for, and shall not be used in, automotive applications. Atmel’s products are not intended, authorized, or warranted for use as components in applications intended to support or sustain life.

Atmel®, logo and combinations thereof, AVR®, STK® and others, are the registered trademarks or trademarks of Atmel Corporation or its subsidiaries. Other terms and product names may be trademarks of others.
