1-
2- ====================================
1+ #################################
32 Linux 12.00.00 Performance Guide
4- ====================================
3+ #################################
54
6- .. rubric:: **Read This First**
7- :name: read-this-first-kernel-perf-guide
5+ ***************
6+ Read This First
7+ ***************
88
99**All performance numbers provided in this document are gathered using the
1010following Evaluation Modules unless otherwise specified.**
@@ -17,27 +17,30 @@ following Evaluation Modules unless otherwise specified.**
1717
1818Table: Evaluation Modules
1919
20- .. rubric:: About This Manual
21- :name: about-this-manual-kernel-perf-guide
20+ *****************
21+ About This Manual
22+ *****************
2223
2324This document provides performance data for each of the device drivers
2425which are part of the Processor SDK Linux package. This document should be
2526used in conjunction with release notes and user guides provided with the
2627Processor SDK Linux package for information on specific issues present
2728with drivers included in a particular release.
2829
29- .. rubric:: If You Need Assistance
30- :name: if-you-need-assistance-kernel-perf-guide
31-
3230For further information or to report any problems, contact
3331https://e2e.ti.com/ or https://support.ti.com/
3432
33+ |
3534
35+ *****************
3636System Benchmarks
37- -----------------
37+ *****************
38+
39+ |
3840
3941LMBench
40- ^^^^^^^
42+ =======
43+
4144LMBench is a collection of microbenchmarks of which the memory bandwidth
4245and latency related ones are typically used to estimate processor
4346memory system performance. More information about lmbench at
@@ -181,7 +184,8 @@ Execute the LMBench with the following:
181184 "tcp_latency_using_localhost (microsec)","0.86"
182185
183186Dhrystone
184- ^^^^^^^^^
187+ =========
188+
185189Dhrystone is a core-only benchmark that runs from warm L1 caches in all
186190modern processors. It scales linearly with clock speed.
187191
@@ -203,7 +207,8 @@ Execute the benchmark with the following:
203207 "dhrystone_per_second (dhrystonep)","6250000.00"
204208
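Reviewer note: because Dhrystone scales linearly with clock speed, the raw score is often normalized to DMIPS and DMIPS/MHz when comparing devices. The following is a minimal sketch, assuming the conventional reference of 1757 Dhrystones per second for 1 DMIPS. The function names and the clock frequency argument are illustrative and are not taken from this guide.

```python
# Conventional reference score (VAX 11/780) that defines 1 DMIPS.
VAX_DHRYSTONES_PER_SECOND = 1757


def dmips(dhrystones_per_second: float) -> float:
    """Convert a raw Dhrystone score into DMIPS."""
    return dhrystones_per_second / VAX_DHRYSTONES_PER_SECOND


def dmips_per_mhz(dhrystones_per_second: float, clock_mhz: float) -> float:
    """Normalize DMIPS by the core clock (in MHz) for cross-device comparison."""
    if clock_mhz <= 0:
        raise ValueError("clock_mhz must be positive")
    return dmips(dhrystones_per_second) / clock_mhz


if __name__ == "__main__":
    # dhrystone_per_second value reported in the table above.
    score = 6250000.00
    print(f"{dmips(score):.1f} DMIPS")
```

The clock frequency is left as a parameter because the diff does not state the core clock used for this measurement.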
205209Whetstone
206- ^^^^^^^^^
210+ =========
211+
207212Whetstone is a benchmark primarily measuring floating-point arithmetic performance.
208213
209214Execute the benchmark with the following:
@@ -218,7 +223,8 @@ Execute the benchmark with the following:
218223 "whetstone (mips)","5000.00"
219224
220225Linpack
221- ^^^^^^^
226+ =======
227+
222228Linpack measures peak double precision (64 bit) floating point performance in
223229solving a dense linear system.
224230
@@ -228,7 +234,8 @@ solving a dense linear system.
228234 "linpack (kflops)","516120.50 (min 514627.00, max 517614.00)"
229235
230236NBench
231- ^^^^^^
237+ ======
238+
232239NBench, which stands for Native Benchmark, is used to measure macro benchmarks
233240for commonly used operations such as sorting and analysis algorithms.
234241More information about NBench at
@@ -249,7 +256,8 @@ https://nbench.io/articles/index.html
249256 "string_sort (iterations)","150.19 (min 150.17, max 150.21)"
250257
251258Stream
252- ^^^^^^
259+ ======
260+
253261STREAM is a microbenchmark for measuring data memory system performance without
254262any data reuse. It is designed to miss on caches and exercise data prefetcher
255263and speculative accesses.
@@ -275,7 +283,8 @@ Execute the benchmark with the following:
275283 "triad (mb/s)","1512.27 (min 1488.30, max 1526.00)"
276284
277285CoreMarkPro
278- ^^^^^^^^^^^
286+ ===========
287+
279288CoreMark®-Pro is a comprehensive, advanced processor benchmark that works with
280289and enhances the market-proven industry-standard EEMBC CoreMark® benchmark.
281290While CoreMark stresses the CPU pipeline, CoreMark-Pro tests the entire processor,
@@ -311,7 +320,8 @@ and floating-point workloads, and data sets for utilizing larger memory subsyste
311320 "zip-test (workloads/)","33.37 (min 32.26, max 34.48)"
312321
313322MultiBench
314- ^^^^^^^^^^
323+ ==========
324+
315325MultiBench™ is a suite of benchmarks that allows processor and system designers to
316326analyze, test, and improve multicore processors. It uses three forms of concurrency:
317327Data decomposition: multiple threads cooperating on achieving a unified goal and
@@ -359,11 +369,13 @@ thread-enabled workloads to be tested.
359369 "x264-4mq (workloads/)","0.50 (min 0.50, max 0.51)"
360370 "x264-4mqw1 (workloads/)","0.50 (min 0.49, max 0.51)"
361371
372+ |
373+
362374Boot-time Measurement
363- ---------------------
375+ =====================
364376
365377Boot media: MMCSD
366- ^^^^^^^^^^^^^^^^^
378+ -----------------
367379
368380.. csv-table:: Linux boot time MMCSD
369381 :header: "Boot Configuration","am62lxx_evm-fs: Boot time in seconds: avg(min,max)"
@@ -375,7 +387,7 @@ Boot time numbers [avg, min, max] are measured from "Starting kernel" to Linux p
375387|
376388
377389ALSA SoC Audio Driver
378- ---------------------
390+ =====================
379391
380392#. Access type - RW\_INTERLEAVED
381393#. Channels - 2
@@ -411,7 +423,8 @@ ALSA SoC Audio Driver
411423|
412424
413425Ethernet
414- --------
426+ ========
427+
415428Ethernet performance benchmarks were measured using :command:`netperf` 2.7.1 https://hewlettpackard.github.io/netperf/doc/netperf.html
416429Test procedures were modeled after those defined in RFC-2544:
417430https://tools.ietf.org/html/rfc2544, where the DUT is the TI device
@@ -477,27 +490,27 @@ Running the following commands will trigger :command:`netperf` clients to measur
477490 netperf -H <DUT ip> -j -C -l 60 -t UDP_STREAM -b <burst_size> -w <wait_time> -- -m <UDP datagram size>
478491 -k DIRECTION,THROUGHPUT,MEAN_LATENCY,LOCAL_CPU_UTIL,REMOTE_CPU_UTIL,LOCAL_BYTES_SENT,REMOTE_BYTES_RECVD,LOCAL_SEND_SIZE
479492
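Reviewer note: the `-k` selector in the command above makes :command:`netperf` print its results as `KEY=VALUE` lines, which are easy to post-process. The following is a minimal sketch of that step. The helper names and the sample values are hypothetical, and the packet-rate formula counts UDP payload bytes only, which may differ from how the kPPS column in this guide was derived.

```python
def parse_netperf_keyval(output: str) -> dict:
    """Collect KEY=VALUE lines from netperf output into a dict of strings."""
    results = {}
    for line in output.splitlines():
        key, sep, value = line.strip().partition("=")
        if not sep or not key:
            continue  # skip banners and blank lines
        results[key] = value
    return results


def packets_per_second(throughput_mbps: float, datagram_bytes: int) -> float:
    """Estimate packets/s from throughput in 10^6 bits/s and payload size."""
    if datagram_bytes <= 0:
        raise ValueError("datagram_bytes must be positive")
    return throughput_mbps * 1e6 / (8 * datagram_bytes)


if __name__ == "__main__":
    # Hypothetical sample, not measured data.
    sample = "DIRECTION=Send\nTHROUGHPUT=100.00\nLOCAL_SEND_SIZE=1250\n"
    parsed = parse_netperf_keyval(sample)
    pps = packets_per_second(float(parsed["THROUGHPUT"]),
                             int(parsed["LOCAL_SEND_SIZE"]))
    print(f"{pps / 1e3:.2f} kPPS")
```

Values are kept as strings in the parsed dict so that non-numeric fields such as `DIRECTION` survive unchanged; convert only the fields you need.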
480- CPSW/CPSW2g/CPSW3g Ethernet Driver
481- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
493+ CPSW/CPSW2g/CPSW3g Ethernet
494+ ---------------------------
482495
483- .. rubric:: TCP Bidirectional Throughput
484- :name: CPSW2g-tcp-bidirectional-throughput
496+ TCP Bidirectional Throughput
497+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
485498
486499.. csv-table:: CPSW2g TCP Bidirectional Throughput
487500 :header: "Command Used","am62lxx_evm-fs: THROUGHPUT (Mbits/sec)","am62lxx_evm-fs: CPU Load % (LOCAL_CPU_UTIL)"
488501
489502 "netperf -H 192.168.0.1 -j -c -C -l 60 -t TCP_STREAM; netperf -H 192.168.0.1 -j -c -C -l 60 -t TCP_MAERTS","1048.02 (min 1035.65, max 1060.38)","99.10 (min 98.47, max 99.72)"
490503
491- .. rubric:: TCP Bidirectional Throughput Interrupt Pacing
492- :name: CPSW2g-tcp-bidirectional-throughput-interrupt-pacing
504+ TCP Bidirectional Throughput Interrupt Pacing
505+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
493506
494507.. csv-table:: CPSW2g TCP Bidirectional Throughput Interrupt Pacing
495508 :header: "Command Used","am62lxx_evm-fs: THROUGHPUT (Mbits/sec)","am62lxx_evm-fs: CPU Load % (LOCAL_CPU_UTIL)"
496509
497510 "netperf -H 192.168.0.1 -j -c -C -l 60 -t TCP_STREAM; netperf -H 192.168.0.1 -j -c -C -l 60 -t TCP_MAERTS","1182.95 (min 1179.68, max 1186.21)","95.50 (min 95.39, max 95.60)"
498511
499- .. rubric:: UDP Throughput
500- :name: CPSW2g-udp-throughput-0-loss
512+ UDP Throughput
513+ ^^^^^^^^^^^^^^
501514
502515.. csv-table:: CPSW2g UDP Egress Throughput 0 loss
503516 :header: "Frame Size(bytes)","am62lxx_evm-fs: UDP Datagram Size(bytes) (LOCAL_SEND_SIZE)","am62lxx_evm-fs: THROUGHPUT (Mbits/sec)","am62lxx_evm-fs: Packets Per Second (kPPS)","am62lxx_evm-fs: CPU Load % (LOCAL_CPU_UTIL)"
@@ -528,8 +541,9 @@ CPSW/CPSW2g/CPSW3g Ethernet Driver
528541
529542|
530543
531- EMMC Driver
532- -----------
544+ EMMC
545+ ====
546+
533547.. warning::
534548
535549 **IMPORTANT**: The performance numbers can be severely affected if the media is
@@ -539,7 +553,7 @@ EMMC Driver
539553 re-mount in async mode.
540554
541555EMMC EXT4 FIO 1G
542- ^^^^^^^^^^^^^^^^
556+ ----------------
543557
544558.. csv-table:: EMMC EXT4 FIO 1G
545559 :header: "Buffer size (bytes)","am62lxx_evm-fs: Write EXT4 Throughput (Mbytes/sec)","am62lxx_evm-fs: Write EXT4 CPU Load (%)","am62lxx_evm-fs: Read EXT4 Throughput (Mbytes/sec)","am62lxx_evm-fs: Read EXT4 CPU Load (%)"
@@ -550,7 +564,7 @@ EMMC EXT4 FIO 1G
550564 "256k","123.00","12.30 (min 11.83, max 13.08)","177.67 (min 175.00, max 179.00)","10.40 (min 9.43, max 11.55)"
551565
552566EMMC EXT4
553- ^^^^^^^^^
567+ ---------
554568
555569.. csv-table:: EMMC EXT4
556570 :header: "Buffer size (bytes)","am62lxx_evm-fs: Write EXT4 Throughput (Mbytes/sec)","am62lxx_evm-fs: Write EXT4 CPU Load (%)","am62lxx_evm-fs: Read EXT4 Throughput (Mbytes/sec)","am62lxx_evm-fs: Read EXT4 CPU Load (%)"
@@ -562,7 +576,7 @@ EMMC EXT4
562576 "5242880","107.71 (min 100.42, max 111.86)","12.14 (min 10.47, max 15.05)","187.16 (min 187.07, max 187.23)","23.25 (min 22.73, max 24.07)"
563577
564578EMMC VFAT
565- ^^^^^^^^^
579+ ---------
566580
567581.. csv-table:: EMMC VFAT
568582 :header: "Buffer size (bytes)","am62lxx_evm-fs: Write VFAT Throughput (Mbytes/sec)","am62lxx_evm-fs: Write VFAT CPU Load (%)","am62lxx_evm-fs: Read VFAT Throughput (Mbytes/sec)","am62lxx_evm-fs: Read VFAT CPU Load (%)"
@@ -573,8 +587,8 @@ EMMC VFAT
573587 "1048576","61.77 (min 22.07, max 76.62)","19.90 (min 18.12, max 22.93)","175.57 (min 174.58, max 177.21)","35.72 (min 33.04, max 40.52)"
574588 "5242880","71.75 (min 23.10, max 85.16)","21.11 (min 19.44, max 22.40)","176.47 (min 176.20, max 176.65)","33.14 (min 32.48, max 34.45)"
575589
576- UBoot EMMC Driver
577- -----------------
590+ UBoot EMMC
591+ ----------
578592
579593.. csv-table:: UBOOT EMMC RAW
580594 :header: "File size (bytes in hex)","am62lxx_evm-fs: Write Throughput (Kbytes/sec)","am62lxx_evm-fs: Read Throughput (Kbytes/sec)"
@@ -583,7 +597,7 @@ UBoot EMMC Driver
583597 "4000000","124830.48","178086.96"
584598
585599MMCSD
586- -----
600+ =====
587601
588602.. warning::
589603
@@ -594,7 +608,7 @@ MMCSD
594608 re-mount in async mode.
595609
596610MMC EXT4 FIO 1G
597- ^^^^^^^^^^^^^^^
611+ ---------------
598612
599613.. csv-table:: MMC EXT4 FIO 1G
600614 :header: "Buffer size (bytes)","am62lxx_evm-fs: Write EXT4 Throughput (Mbytes/sec)","am62lxx_evm-fs: Write EXT4 CPU Load (%)","am62lxx_evm-fs: Read EXT4 Throughput (Mbytes/sec)","am62lxx_evm-fs: Read EXT4 CPU Load (%)"
@@ -605,7 +619,7 @@ MMC EXT4 FIO 1G
605619 "256k","39.40 (min 38.90, max 39.90)","5.83 (min 5.38, max 6.28)","83.85 (min 83.70, max 84.00)","6.30 (min 6.26, max 6.34)"
606620
607621MMC EXT4
608- ^^^^^^^^
622+ --------
609623
610624.. csv-table:: MMC EXT4
611625 :header: "Buffer size (bytes)","am62lxx_evm-fs: Write Raw Throughput (Mbytes/sec)","am62lxx_evm-fs: Write Raw CPU Load (%)","am62lxx_evm-fs: Read Raw Throughput (Mbytes/sec)","am62lxx_evm-fs: Read Raw CPU Load (%)"
@@ -640,11 +654,11 @@ The performance numbers were captured using the following:
640654
641655|
642656
643- USB Driver
644- ----------
657+ USB
658+ ===
645659
646660USB Device Controller
647- ^^^^^^^^^^^^^^^^^^^^^
661+ ---------------------
648662
649663.. csv-table:: USBDEVICE HIGHSPEED SLAVE_READ_THROUGHPUT
650664 :header: "Number of Blocks","am62lxx_evm-fs: Throughput (MB/sec)"
@@ -657,11 +671,13 @@ USB Device Controller
657671
658672 "150","31.63 (min 31.20, max 32.20)"
659673
660- CRYPTO Driver
661- -------------
674+ |
675+
676+ CRYPTO
677+ ======
662678
663679OpenSSL Performance
664- ^^^^^^^^^^^^^^^^^^^
680+ -------------------
665681
666682.. csv-table:: OpenSSL Performance
667683 :header: "Algorithm","Buffer Size (in bytes)","am62lxx_evm-fs: throughput (KBytes/Sec)"
@@ -762,3 +778,5 @@ benchmark test.
762778.. code-block:: console
763779
764780 time -v openssl speed -elapsed -evp aes-128-cbc
781+
782+ |
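Reviewer note: the `time -v` wrapper above reports user, system, and elapsed time for the :command:`openssl speed` run, and a CPU load percentage can be derived from those three values as (user + system) / elapsed. The following is a minimal sketch of that arithmetic. The function name and the sample numbers are hypothetical, and this is not necessarily how the CPU load figures in this guide were produced.

```python
def cpu_load_percent(user_s: float, sys_s: float, elapsed_s: float) -> float:
    """CPU load as the share of wall-clock time spent in user plus system mode."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return 100.0 * (user_s + sys_s) / elapsed_s


if __name__ == "__main__":
    # Hypothetical timings in seconds, not measured data.
    print(f"{cpu_load_percent(1.5, 0.5, 4.0):.1f} %")
```

On a multi-core device this ratio can exceed 100 when the workload runs on more than one core.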