Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[SPARK-35295][ML] Replace fully com.github.fommil.netlib by dev.ludov…
…ic.netlib:2.0 ### What changes were proposed in this pull request? Bump to `dev.ludovic.netlib:2.0` which provides JNI-based wrappers for BLAS, ARPACK, and LAPACK. Theseare not taking dependencies on GPL or LGPL libraries, allowing to provide out-of-the-box support for hardware acceleration when a native library is present (this is still up to the end-user to install such library on their system, like OpenBLAS, Intel MKL, and libarpack2). ### Why are the changes needed? Great performance improvement for ML-related workload on vanilla-distributions of Spark. ### Does this PR introduce _any_ user-facing change? Users now take advantage of hardware acceleration as long as a native library is installed (like OpenBLAS, Intel MKL and libarpack2). ### How was this patch tested? Spark test-suite + dev.ludovic.netlib testsuite. #### JDK8: ``` [info] OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.8.0-50-generic [info] Intel(R) Xeon(R) E-2276G CPU 3.80GHz [info] [info] f2jBLAS = dev.ludovic.netlib.blas.F2jBLAS [info] javaBLAS = dev.ludovic.netlib.blas.Java8BLAS [info] nativeBLAS = dev.ludovic.netlib.blas.JNIBLAS [info] [info] daxpy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 220 226 6 454.9 2.2 1.0X [info] java 221 228 5 451.9 2.2 1.0X [info] native 209 215 5 478.7 2.1 1.1X [info] [info] saxpy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 121 125 3 823.3 1.2 1.0X [info] java 121 125 3 824.3 1.2 1.0X [info] native 101 105 3 988.4 1.0 1.2X [info] [info] dcopy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 212 219 6 470.9 2.1 1.0X [info] java 208 212 4 481.0 2.1 1.0X [info] native 209 215 5 478.5 2.1 1.0X [info] [info] scopy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 114 119 3 878.9 1.1 1.0X [info] java 99 105 3 1011.4 1.0 1.2X [info] native 97 103 3 1026.7 1.0 1.2X [info] [info] ddot: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 108 111 2 925.9 1.1 1.0X [info] java 71 73 2 1414.9 0.7 1.5X [info] native 54 56 2 1847.0 0.5 2.0X [info] [info] sdot: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 96 97 2 1046.8 1.0 1.0X [info] java 47 48 1 2129.8 0.5 2.0X [info] native 29 30 1 3404.7 0.3 3.3X [info] [info] dnrm2: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 139 143 2 718.2 1.4 1.0X [info] java 46 47 1 2171.2 0.5 3.0X [info] native 44 46 2 2261.8 0.4 3.1X [info] [info] snrm2: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 154 157 4 651.0 1.5 1.0X [info] java 40 42 1 2469.3 0.4 3.8X [info] native 26 27 1 3787.6 0.3 5.8X [info] [info] dscal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 185 195 8 541.0 1.8 1.0X [info] java 186 196 7 538.5 1.9 1.0X [info] native 177 187 7 564.1 1.8 1.0X [info] [info] sscal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 98 102 3 1016.2 1.0 1.0X [info] java 98 102 3 1017.8 1.0 1.0X [info] native 87 91 3 1143.2 0.9 1.1X [info] [info] dgemv[N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 68 70 1 1474.7 0.7 1.0X [info] java 51 52 1 1973.0 0.5 1.3X [info] native 30 32 1 3298.8 0.3 2.2X [info] [info] dgemv[T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 96 99 2 1037.9 1.0 1.0X [info] java 50 51 1 1999.6 0.5 1.9X [info] native 30 31 1 3368.1 0.3 3.2X [info] [info] sgemv[N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 59 61 1 1688.7 0.6 1.0X [info] java 41 42 1 2461.9 0.4 1.5X [info] native 15 16 1 6593.0 0.2 3.9X [info] [info] sgemv[T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 90 92 1 1116.2 0.9 1.0X [info] java 39 40 1 2565.8 0.4 2.3X [info] native 15 16 1 6594.2 0.2 5.9X [info] [info] dger: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 192 202 7 520.5 1.9 1.0X [info] java 203 214 7 491.9 2.0 0.9X [info] native 176 187 7 568.8 1.8 1.1X [info] [info] dspmv[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 59 61 1 846.1 1.2 1.0X [info] java 38 39 1 1313.5 0.8 1.6X [info] native 24 27 1 2047.8 0.5 2.4X [info] [info] dspr[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 97 101 3 515.4 1.9 1.0X [info] java 97 101 2 515.1 1.9 1.0X [info] native 88 91 3 569.1 1.8 1.1X [info] [info] dsyr[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 169 174 3 295.4 3.4 1.0X [info] java 169 174 3 295.4 3.4 1.0X [info] native 160 165 4 312.2 3.2 1.1X [info] [info] dgemm[N,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 561 577 13 1782.3 0.6 1.0X [info] java 225 231 4 4446.2 0.2 2.5X [info] native 31 32 3 32473.1 0.0 18.2X [info] [info] dgemm[N,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 570 584 9 1754.8 0.6 1.0X [info] java 224 230 4 4457.3 0.2 2.5X [info] native 31 32 1 32493.4 0.0 18.5X [info] [info] dgemm[T,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 855 866 6 1169.2 0.9 1.0X [info] java 224 228 3 4466.9 0.2 3.8X [info] native 31 32 1 32395.5 0.0 27.7X [info] [info] dgemm[T,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 1328 1344 8 752.8 1.3 1.0X [info] java 224 230 4 4458.9 0.2 5.9X [info] native 31 32 1 32201.8 0.0 42.8X [info] [info] sgemm[N,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 534 541 5 1873.0 0.5 1.0X [info] java 220 224 3 4542.8 0.2 2.4X [info] native 15 16 1 66803.1 0.0 35.7X [info] [info] sgemm[N,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 544 551 6 1839.6 0.5 1.0X [info] java 220 224 4 4538.2 0.2 2.5X [info] native 15 16 1 65589.9 0.0 35.7X [info] [info] sgemm[T,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 833 845 21 1201.0 0.8 1.0X [info] java 220 224 3 4548.7 0.2 3.8X [info] native 15 16 1 66603.2 0.0 55.5X [info] [info] sgemm[T,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 899 907 5 1112.9 0.9 1.0X [info] java 221 224 2 4531.6 0.2 4.1X [info] native 15 16 1 65944.9 0.0 59.3X ``` #### JDK11: ``` [info] OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.8.0-50-generic [info] Intel(R) Xeon(R) E-2276G CPU 3.80GHz [info] [info] f2jBLAS = dev.ludovic.netlib.blas.F2jBLAS [info] javaBLAS = dev.ludovic.netlib.blas.Java11BLAS [info] nativeBLAS = dev.ludovic.netlib.blas.JNIBLAS [info] [info] daxpy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 195 200 3 512.2 2.0 1.0X [info] java 197 202 3 507.0 2.0 1.0X [info] native 184 189 4 543.0 1.8 1.1X [info] [info] saxpy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 108 112 3 921.8 1.1 1.0X [info] java 101 105 3 989.4 1.0 1.1X [info] native 87 91 3 1147.1 0.9 1.2X [info] [info] dcopy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 187 191 3 535.1 1.9 1.0X [info] java 182 188 3 548.8 1.8 1.0X [info] native 178 182 3 562.2 1.8 1.1X [info] [info] scopy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 110 114 3 909.3 1.1 1.0X [info] java 86 93 4 1159.3 0.9 1.3X [info] native 86 90 3 1162.4 0.9 1.3X [info] [info] ddot: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 106 108 2 943.6 1.1 1.0X [info] java 70 71 2 1426.8 0.7 1.5X [info] native 54 56 2 1835.4 0.5 1.9X [info] [info] sdot: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 96 97 1 1047.1 1.0 1.0X [info] java 43 44 1 2331.9 0.4 2.2X [info] native 29 30 1 3392.1 0.3 3.2X [info] [info] dnrm2: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 114 115 2 880.7 1.1 1.0X [info] java 42 43 1 2398.1 0.4 2.7X [info] native 45 46 1 2233.3 0.4 2.5X [info] [info] snrm2: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 140 143 2 714.6 1.4 1.0X [info] java 28 29 1 3531.0 0.3 4.9X [info] native 26 27 1 3820.0 0.3 5.3X [info] [info] dscal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 156 166 7 641.3 1.6 1.0X [info] java 158 167 6 633.2 1.6 1.0X [info] native 150 160 7 664.8 1.5 1.0X [info] [info] sscal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 85 88 2 1181.7 0.8 1.0X [info] java 85 88 2 1176.0 0.9 1.0X [info] native 75 78 2 1333.2 0.8 1.1X [info] [info] dgemv[N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 58 59 1 1731.1 0.6 1.0X [info] java 41 43 1 2415.5 0.4 1.4X [info] native 30 31 1 3293.9 0.3 1.9X [info] [info] dgemv[T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 94 96 1 1063.4 0.9 1.0X [info] java 41 42 1 2435.8 0.4 2.3X [info] native 30 30 1 3379.8 0.3 3.2X [info] [info] sgemv[N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 44 45 1 2278.9 0.4 1.0X [info] java 37 38 0 2686.8 0.4 1.2X [info] native 15 16 1 6555.4 0.2 2.9X [info] [info] sgemv[T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 88 89 1 1142.1 0.9 1.0X [info] java 33 34 1 3010.7 0.3 2.6X [info] native 15 16 1 6553.9 0.2 5.7X [info] [info] dger: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 164 172 4 609.4 1.6 1.0X [info] java 163 172 5 612.6 1.6 1.0X [info] native 150 159 4 667.0 1.5 1.1X [info] [info] dspmv[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 49 50 1 1029.4 1.0 1.0X [info] java 41 42 1 1209.4 0.8 1.2X [info] native 25 27 1 2029.2 0.5 2.0X [info] [info] dspr[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 80 85 3 622.2 1.6 1.0X [info] java 80 85 3 622.4 1.6 1.0X [info] native 75 79 3 668.7 1.5 1.1X [info] [info] dsyr[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 137 142 3 364.1 2.7 1.0X [info] java 139 142 2 360.4 2.8 1.0X [info] native 131 135 3 380.4 2.6 1.0X [info] [info] dgemm[N,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 517 525 5 1935.5 0.5 1.0X [info] java 213 216 3 4704.8 0.2 2.4X [info] native 31 31 1 32705.6 0.0 16.9X [info] [info] dgemm[N,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 589 601 6 1698.6 0.6 1.0X [info] java 213 217 3 4693.3 0.2 2.8X [info] native 31 32 1 32498.9 0.0 19.1X [info] [info] dgemm[T,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 851 865 6 1175.3 0.9 1.0X [info] java 212 216 3 4717.0 0.2 4.0X [info] native 30 32 1 32903.0 0.0 28.0X [info] [info] dgemm[T,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 1301 1316 6 768.4 1.3 1.0X [info] java 212 216 2 4717.4 0.2 6.1X [info] native 31 32 1 32606.0 0.0 42.4X [info] [info] sgemm[N,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 454 460 2 2203.0 0.5 1.0X [info] java 208 212 3 4803.8 0.2 2.2X [info] native 15 16 0 66586.0 0.0 30.2X [info] [info] sgemm[N,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 529 536 4 1889.7 0.5 1.0X [info] java 208 212 3 4798.6 0.2 2.5X [info] native 15 16 1 66751.4 0.0 35.3X [info] [info] sgemm[T,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 830 840 5 1205.1 0.8 1.0X [info] java 208 211 2 4814.1 0.2 4.0X [info] native 15 15 1 67676.4 0.0 56.2X [info] [info] sgemm[T,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 894 907 7 1118.7 0.9 1.0X [info] java 208 211 3 4809.6 0.2 4.3X [info] native 15 16 1 66675.2 0.0 59.6X ``` #### JDK16: ``` [info] OpenJDK 64-Bit Server VM 16+36 on Linux 5.8.0-50-generic [info] Intel(R) Xeon(R) E-2276G CPU 3.80GHz [info] [info] f2jBLAS = dev.ludovic.netlib.blas.F2jBLAS [info] javaBLAS = dev.ludovic.netlib.blas.VectorBLAS [info] nativeBLAS = dev.ludovic.netlib.blas.JNIBLAS [info] [info] daxpy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 193 199 3 517.5 1.9 1.0X [info] java 181 186 4 553.2 1.8 1.1X [info] native 181 185 5 553.6 1.8 1.1X [info] [info] saxpy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 108 112 2 925.1 1.1 1.0X [info] java 88 91 3 1138.6 0.9 1.2X [info] native 87 91 3 1144.2 0.9 1.2X [info] [info] dcopy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 184 189 3 542.5 1.8 1.0X [info] java 181 185 3 552.8 1.8 1.0X [info] native 179 183 2 558.0 1.8 1.0X [info] [info] scopy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 97 101 3 1031.6 1.0 1.0X [info] java 86 90 2 1163.7 0.9 1.1X [info] native 85 88 2 1182.9 0.8 1.1X [info] [info] ddot: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 107 109 2 932.4 1.1 1.0X [info] java 54 56 2 1846.7 0.5 2.0X [info] native 54 56 2 1846.7 0.5 2.0X [info] [info] sdot: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 96 97 1 1043.6 1.0 1.0X [info] java 29 30 1 3439.3 0.3 3.3X [info] native 29 30 1 3423.9 0.3 3.3X [info] [info] dnrm2: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 121 123 2 829.8 1.2 1.0X [info] java 32 32 1 3171.3 0.3 3.8X [info] native 45 46 1 2246.2 0.4 2.7X [info] [info] snrm2: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 142 144 2 705.9 1.4 1.0X [info] java 15 16 1 6585.8 0.2 9.3X [info] native 26 27 1 3839.5 0.3 5.4X [info] [info] dscal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 157 165 5 635.6 1.6 1.0X [info] java 151 159 5 664.0 1.5 1.0X [info] native 151 160 5 663.6 1.5 1.0X [info] [info] sscal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 85 89 2 1172.3 0.9 1.0X [info] java 75 79 3 1337.3 0.7 1.1X [info] native 75 79 2 1335.5 0.7 1.1X [info] [info] dgemv[N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 58 59 1 1731.5 0.6 1.0X [info] java 28 29 1 3544.2 0.3 2.0X [info] native 30 31 1 3306.2 0.3 1.9X [info] [info] dgemv[T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 90 92 1 1108.3 0.9 1.0X [info] java 28 28 1 3622.5 0.3 3.3X [info] native 30 31 1 3381.3 0.3 3.1X [info] [info] sgemv[N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 44 45 1 2284.7 0.4 1.0X [info] java 14 15 1 7034.0 0.1 3.1X [info] native 15 16 1 6643.7 0.2 2.9X [info] [info] sgemv[T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 85 86 1 1177.4 0.8 1.0X [info] java 15 15 1 6886.1 0.1 5.8X [info] native 15 16 1 6560.1 0.2 5.6X [info] [info] dger: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 164 173 6 608.1 1.6 1.0X [info] java 148 157 5 675.2 1.5 1.1X [info] native 152 160 5 659.9 1.5 1.1X [info] [info] dspmv[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 61 63 1 815.4 1.2 1.0X [info] java 16 17 1 3104.3 0.3 3.8X [info] native 24 27 1 2071.9 0.5 2.5X [info] [info] dspr[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 81 85 2 616.4 1.6 1.0X [info] java 81 85 2 614.7 1.6 1.0X [info] native 75 78 2 669.5 1.5 1.1X [info] [info] dsyr[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 138 141 3 362.7 2.8 1.0X [info] java 137 140 2 365.3 2.7 1.0X [info] native 131 134 2 382.9 2.6 1.1X [info] [info] dgemm[N,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 525 544 8 1906.2 0.5 1.0X [info] java 61 68 3 16358.1 0.1 8.6X [info] native 31 32 1 32623.7 0.0 17.1X [info] [info] dgemm[N,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 580 598 12 1724.5 0.6 1.0X [info] java 61 68 4 16302.5 0.1 9.5X [info] native 30 32 1 32962.8 0.0 19.1X [info] [info] dgemm[T,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 829 838 4 1206.2 0.8 1.0X [info] java 61 69 3 16339.7 0.1 13.5X [info] native 30 31 1 33231.9 0.0 27.6X [info] [info] dgemm[T,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 1352 1363 5 739.6 1.4 1.0X [info] java 61 69 3 16347.0 0.1 22.1X [info] native 31 32 1 32740.3 0.0 44.3X [info] [info] sgemm[N,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 482 493 7 2073.1 0.5 1.0X [info] java 35 38 2 28315.3 0.0 13.7X [info] native 15 15 1 67579.7 0.0 32.6X [info] [info] sgemm[N,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 472 482 4 2119.0 0.5 1.0X [info] java 36 38 2 28138.1 0.0 13.3X [info] native 15 16 1 66616.5 0.0 31.4X [info] [info] sgemm[T,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 823 830 5 1215.2 0.8 1.0X [info] java 35 38 2 28681.4 0.0 23.6X [info] native 15 15 1 67908.4 0.0 55.9X [info] [info] sgemm[T,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 896 908 7 1115.8 0.9 1.0X [info] java 35 38 2 28402.0 0.0 25.5X [info] native 15 16 0 66691.2 0.0 59.8X ``` TODO: - [x] update documentation in `docs/` and `docs/ml-linalg-guide.md` refering `com.github.fommil.netlib` - [ ] merge luhenry/netlib#1 with all feedback from this PR + remove references to snapshot repositories in `pom.xml` and `project/SparkBuild.scala`. Closes apache#32415 from luhenry/master. Authored-by: Ludovic Henry <[email protected]> Signed-off-by: Sean Owen <[email protected]>
- Loading branch information