AMCC suggested setting the PMU bit to 0 for best performance on the
PPC440 DDR controller. The common 440 DDR setup files (sdram.c &
spd_sdram.c) have been changed accordingly, so all 440 boards using
these setup routines automatically receive this performance
increase.
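As a rough illustration of what "setting the PMU bit to 0" amounts to, the
sketch below clears one bit of a 32-bit configuration value in a
read-modify-write style. It is written in Python purely for illustration and
is not the code in sdram.c or spd_sdram.c. PMU_MASK_PLACEHOLDER and
clear_bits() are hypothetical names, and the mask value is a placeholder: the
real position of SDRAM0_CFG0[PMU] has to be taken from the PPC440 user's
manual.

```python
# Sketch only: clear selected bits in a 32-bit register value.
# PMU_MASK_PLACEHOLDER is NOT the real SDRAM0_CFG0[PMU] mask; it is a
# made-up value used to show the read-modify-write pattern.
PMU_MASK_PLACEHOLDER = 0x00000001


def clear_bits(reg_value: int, mask: int) -> int:
    """Return reg_value with every bit in mask cleared, limited to 32 bits."""
    return reg_value & ~mask & 0xFFFFFFFF


def bits_set(reg_value: int, mask: int) -> bool:
    """Return True if all bits in mask are set in reg_value."""
    return (reg_value & mask) == mask


if __name__ == "__main__":
    cfg0 = 0xFFFFFFFF  # pretend value read from the register
    new_cfg0 = clear_bits(cfg0, PMU_MASK_PLACEHOLDER)
    print(f"before: 0x{cfg0:08X}  after: 0x{new_cfg0:08X}")
```

All other bits of the value are left untouched, which is the point of masking
instead of writing a fixed constant.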

The benchmarks below, run by AMCC, demonstrate this performance
change:


----------------------------------------
SDRAM0_CFG0[PMU] = 1 (U-Boot default for Bamboo, Yosemite and Yellowstone)
----------------------------------------
Stream benchmark results
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 0
Total memory required = 45.8 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 112345 microseconds.
   (= 112345 clock ticks)
Increase the size of the arrays if this shows that you are not getting
at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the precision of your system
timer.
-------------------------------------------------------------
Function      Rate (MB/s)   RMS time     Min time     Max time
Copy:         256.7683       0.1248       0.1246       0.1250
Scale:        246.0157       0.1302       0.1301       0.1302
Add:          255.0316       0.1883       0.1882       0.1885
Triad:        253.1245       0.1897       0.1896       0.1899


TTCP Benchmark Results
ttcp-t: socket
ttcp-t: connect
ttcp-t: buflen=8192, nbuf=2048, align=16384/0, port=5000  tcp  ->
localhost
ttcp-t: 16777216 bytes in 0.28 real seconds = 454.29 Mbit/sec +++
ttcp-t: 2048 I/O calls, msec/call = 0.14, calls/sec = 7268.57
ttcp-t: 0.0user 0.1sys 0:00real 60% 0i+0d 0maxrss 0+2pf 3+1506csw
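The ttcp figures above can be cross-checked with a little arithmetic. The
sketch below assumes, based only on the numbers in this log, that the elapsed
time can be recovered from the calls/sec value and that ttcp reports Mbit/sec
with 1 Mbit = 2^20 bits. ttcp_summary() is a hypothetical helper, not part of
ttcp.

```python
def ttcp_summary(buflen: int, nbuf: int, calls_per_sec: float):
    """Recompute total bytes, elapsed seconds and Mbit/sec for a ttcp run.

    Assumption (inferred from the log, not from ttcp documentation):
    1 Mbit is taken as 2**20 bits.
    """
    total_bytes = buflen * nbuf            # bytes sent: buffers * buffer size
    elapsed = nbuf / calls_per_sec         # seconds, one I/O call per buffer
    mbit_per_sec = total_bytes * 8 / elapsed / 2**20
    return total_bytes, elapsed, mbit_per_sec


if __name__ == "__main__":
    # Values copied from the PMU = 1 run above.
    total, secs, rate = ttcp_summary(buflen=8192, nbuf=2048,
                                     calls_per_sec=7268.57)
    print(f"{total} bytes in {secs:.2f} s = {rate:.2f} Mbit/sec")
```

With the PMU = 1 values this reproduces the 16777216 bytes, 0.28 seconds and
454.29 Mbit/sec shown in the log, so the rounded "0.28 real seconds" is not
what the rate was computed from.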

----------------------------------------
SDRAM0_CFG0[PMU] = 0 (Suggested modification)
Setting PMU = 0 provides a noticeable performance improvement:
* 2% to 5% improvement in memory performance.
* Improves the Mbit/sec for the TTCP benchmark by almost 77%.
----------------------------------------
Stream benchmark results
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 0
Total memory required = 45.8 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 120066 microseconds.
   (= 120066 clock ticks)
Increase the size of the arrays if this shows that you are not getting
at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the precision of your system
timer.
-------------------------------------------------------------
Function      Rate (MB/s)   RMS time     Min time     Max time
Copy:         262.5167       0.1221       0.1219       0.1223
Scale:        258.4856       0.1238       0.1238       0.1240
Add:          262.5404       0.1829       0.1828       0.1831
Triad:        266.8594       0.1800       0.1799       0.1802

TTCP Benchmark Results
ttcp-t: socket
ttcp-t: connect
ttcp-t: buflen=8192, nbuf=2048, align=16384/0, port=5000  tcp  ->
localhost
ttcp-t: 16777216 bytes in 0.16 real seconds = 804.06 Mbit/sec +++
ttcp-t: 2048 I/O calls, msec/call = 0.08, calls/sec = 12864.89
ttcp-t: 0.0user 0.0sys 0:00real 46% 0i+0d 0maxrss 0+2pf 120+1csw
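The size of the improvement can be recomputed directly from the two sets of
results in this file. The sketch below is only a convenience calculation; the
dictionary names and gain_percent() are hypothetical, and the numbers are
copied from the Stream and TTCP output of the PMU = 1 and PMU = 0 runs.

```python
# Stream rates in MB/s, copied from the two runs in this file.
STREAM_PMU1 = {"Copy": 256.7683, "Scale": 246.0157,
               "Add": 255.0316, "Triad": 253.1245}
STREAM_PMU0 = {"Copy": 262.5167, "Scale": 258.4856,
               "Add": 262.5404, "Triad": 266.8594}

# TTCP throughput in Mbit/sec, copied from the two runs in this file.
TTCP_PMU1 = 454.29
TTCP_PMU0 = 804.06


def gain_percent(before: float, after: float) -> float:
    """Relative improvement of after over before, in percent."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    for name, before in STREAM_PMU1.items():
        gain = gain_percent(before, STREAM_PMU0[name])
        print(f"Stream {name:6s} {gain:5.1f} %")
    print(f"TTCP          {gain_percent(TTCP_PMU1, TTCP_PMU0):5.1f} %")
```

For these figures the Stream gains come out between roughly 2% and 5.5%, and
the TTCP gain at about 77%.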


2006-07-28, Stefan Roese <sr@denx.de>