cpp-vs-torch

The Museum

Raw Benchmark Data and Glossary

Dataset Browser

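| N | Language | Kernel | Avg Time (s) | GFLOPS |
|---|---|---|---|---|
| 10 | c | naive | 0.000001 | N/A |
| 10 | c | tiled | 0.000001 | N/A |
| 10 | c | simd | 0.000001 | N/A |
| 20 | c | naive | 0.000006 | N/A |
| 20 | c | tiled | 0.000005 | N/A |
| 20 | c | simd | 0.000001 | N/A |
| 30 | c | naive | 0.000017 | N/A |
| 30 | c | tiled | 0.000016 | N/A |
| 30 | c | simd | 0.000004 | N/A |
| 40 | c | naive | 0.000039 | N/A |
| 40 | c | tiled | 0.000037 | N/A |
| 40 | c | simd | 0.000006 | N/A |
| 50 | c | naive | 0.000076 | N/A |
| 50 | c | tiled | 0.000068 | N/A |
| 50 | c | simd | 0.000012 | N/A |
| 60 | c | naive | 0.000134 | N/A |
| 60 | c | tiled | 0.000119 | N/A |
| 60 | c | simd | 0.000022 | N/A |
| 70 | c | naive | 0.000234 | N/A |
| 70 | c | tiled | 0.000196 | N/A |
| 70 | c | simd | 0.000040 | N/A |
| 80 | c | naive | 0.000361 | N/A |
| 80 | c | tiled | 0.000292 | N/A |
| 80 | c | simd | 0.000045 | N/A |
| 90 | c | naive | 0.000529 | N/A |
| 90 | c | tiled | 0.000406 | N/A |
| 90 | c | simd | 0.000063 | N/A |
| 100 | c | naive | 0.000735 | N/A |
| 100 | c | tiled | 0.000572 | N/A |
| 100 | c | simd | 0.000086 | N/A |
| 200 | c | naive | 0.006440 | N/A |
| 200 | c | tiled | 0.004566 | N/A |
| 200 | c | simd | 0.000732 | N/A |
| 300 | c | naive | 0.022219 | N/A |
| 300 | c | tiled | 0.015264 | N/A |
| 300 | c | simd | 0.002410 | N/A |
| 400 | c | naive | 0.053575 | N/A |
| 400 | c | tiled | 0.036242 | N/A |
| 400 | c | simd | 0.006210 | N/A |
| 500 | c | naive | 0.104916 | N/A |
| 500 | c | tiled | 0.070800 | N/A |
| 500 | c | simd | 0.011402 | N/A |
| 600 | c | naive | 0.183185 | N/A |
| 600 | c | tiled | 0.122488 | N/A |
| 600 | c | simd | 0.021581 | N/A |
| 700 | c | naive | 0.291707 | N/A |
| 700 | c | tiled | 0.194633 | N/A |
| 700 | c | simd | 0.032773 | N/A |
| 800 | c | naive | 0.496373 | N/A |
| 800 | c | tiled | 0.290745 | N/A |
| 800 | c | simd | 0.050909 | N/A |
| 900 | c | naive | 0.676366 | N/A |
| 900 | c | tiled | 0.415413 | N/A |
| 900 | c | simd | 0.068217 | N/A |
| 1000 | c | naive | 0.934128 | N/A |
| 1000 | c | tiled | 0.569557 | N/A |
| 1000 | c | simd | 0.097907 | N/A |
| 2000 | c | naive | 9.735280 | N/A |
| 2000 | c | tiled | 4.590740 | N/A |
| 2000 | c | simd | 0.874592 | N/A |
| 10 | cpp | naive | 0.000001 | N/A |
| 10 | cpp | tiled | 0.000000 | N/A |
| 10 | cpp | simd | 0.000001 | N/A |
| 20 | cpp | naive | 0.000003 | N/A |
| 20 | cpp | tiled | 0.000003 | N/A |
| 20 | cpp | simd | 0.000003 | N/A |
| 30 | cpp | naive | 0.000010 | N/A |
| 30 | cpp | tiled | 0.000009 | N/A |
| 30 | cpp | simd | 0.000008 | N/A |
| 40 | cpp | naive | 0.000022 | N/A |
| 40 | cpp | tiled | 0.000020 | N/A |
| 40 | cpp | simd | 0.000010 | N/A |
| 50 | cpp | naive | 0.000046 | N/A |
| 50 | cpp | tiled | 0.000037 | N/A |
| 50 | cpp | simd | 0.000019 | N/A |
| 60 | cpp | naive | 0.000088 | N/A |
| 60 | cpp | tiled | 0.000064 | N/A |
| 60 | cpp | simd | 0.000032 | N/A |
| 70 | cpp | naive | 0.000147 | N/A |
| 70 | cpp | tiled | 0.000108 | N/A |
| 70 | cpp | simd | 0.000055 | N/A |
| 80 | cpp | naive | 0.000237 | N/A |
| 80 | cpp | tiled | 0.000161 | N/A |
| 80 | cpp | simd | 0.000062 | N/A |
| 90 | cpp | naive | 0.000367 | N/A |
| 90 | cpp | tiled | 0.000218 | N/A |
| 90 | cpp | simd | 0.000096 | N/A |
| 100 | cpp | naive | 0.000499 | N/A |
| 100 | cpp | tiled | 0.000307 | N/A |
| 100 | cpp | simd | 0.000131 | N/A |
| 200 | cpp | naive | 0.005669 | N/A |
| 200 | cpp | tiled | 0.002490 | N/A |
| 200 | cpp | simd | 0.000967 | N/A |
| 300 | cpp | naive | 0.020533 | N/A |
| 300 | cpp | tiled | 0.008257 | N/A |
| 300 | cpp | simd | 0.003722 | N/A |
| 400 | cpp | naive | 0.050361 | N/A |
| 400 | cpp | tiled | 0.019467 | N/A |
| 400 | cpp | simd | 0.008130 | N/A |
| 500 | cpp | naive | 0.100307 | N/A |
| 500 | cpp | tiled | 0.037864 | N/A |
| 500 | cpp | simd | 0.016425 | N/A |
| 600 | cpp | naive | 0.175949 | N/A |
| 600 | cpp | tiled | 0.065575 | N/A |
| 600 | cpp | simd | 0.030589 | N/A |
| 700 | cpp | naive | 0.284257 | N/A |
| 700 | cpp | tiled | 0.104285 | N/A |
| 700 | cpp | simd | 0.044208 | N/A |
| 800 | cpp | naive | 0.429051 | N/A |
| 800 | cpp | tiled | 0.156286 | N/A |
| 800 | cpp | simd | 0.069136 | N/A |
| 900 | cpp | naive | 0.607257 | N/A |
| 900 | cpp | tiled | 0.222535 | N/A |
| 900 | cpp | simd | 0.092915 | N/A |
| 1000 | cpp | naive | 0.835162 | N/A |
| 1000 | cpp | tiled | 0.305381 | N/A |
| 1000 | cpp | simd | 0.134594 | N/A |
| 2000 | cpp | naive | 8.484150 | N/A |
| 2000 | cpp | tiled | 2.477330 | N/A |
| 2000 | cpp | simd | 1.125940 | N/A |
| 10 | pytorch | aten | 0.000002 | N/A |
| 10 | numpy | openblas/mkl | 0.000001 | N/A |
| 20 | pytorch | aten | 0.000002 | N/A |
| 20 | numpy | openblas/mkl | 0.000001 | N/A |
| 30 | pytorch | aten | 0.000003 | N/A |
| 30 | numpy | openblas/mkl | 0.000002 | N/A |
| 40 | pytorch | aten | 0.000003 | N/A |
| 40 | numpy | openblas/mkl | 0.000003 | N/A |
| 50 | pytorch | aten | 0.000005 | N/A |
| 50 | numpy | openblas/mkl | 0.000004 | N/A |
| 60 | pytorch | aten | 0.000005 | N/A |
| 60 | numpy | openblas/mkl | 0.000006 | N/A |
| 70 | pytorch | aten | 0.000009 | N/A |
| 70 | numpy | openblas/mkl | 0.000009 | N/A |
| 80 | pytorch | aten | 0.000010 | N/A |
| 80 | numpy | openblas/mkl | 0.000011 | N/A |
| 90 | pytorch | aten | 0.000018 | N/A |
| 90 | numpy | openblas/mkl | 0.000015 | N/A |
| 100 | pytorch | aten | 0.000017 | N/A |
| 100 | numpy | openblas/mkl | 0.000020 | N/A |
| 200 | pytorch | aten | 0.000128 | N/A |
| 200 | numpy | openblas/mkl | 0.000132 | N/A |
| 300 | pytorch | aten | 0.000412 | N/A |
| 300 | numpy | openblas/mkl | 0.000418 | N/A |
| 400 | pytorch | aten | 0.000981 | N/A |
| 400 | numpy | openblas/mkl | 0.000961 | N/A |
| 500 | pytorch | aten | 0.001851 | N/A |
| 500 | numpy | openblas/mkl | 0.001839 | N/A |
| 600 | pytorch | aten | 0.003142 | N/A |
| 600 | numpy | openblas/mkl | 0.003102 | N/A |
| 700 | pytorch | aten | 0.004952 | N/A |
| 700 | numpy | openblas/mkl | 0.005011 | N/A |
| 800 | pytorch | aten | 0.007352 | N/A |
| 800 | numpy | openblas/mkl | 0.007334 | N/A |
| 900 | pytorch | aten | 0.010359 | N/A |
| 900 | numpy | openblas/mkl | 0.010355 | N/A |
| 1000 | pytorch | aten | 0.014185 | N/A |
| 1000 | numpy | openblas/mkl | 0.014203 | N/A |
| 2000 | pytorch | aten | 0.115306 | N/A |
| 2000 | numpy | openblas/mkl | 0.112743 | N/A |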
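
The GFLOPS column is N/A in every row, but a rough figure can be derived from N and the average time. The sketch below assumes each row times one dense N x N by N x N matrix multiply, counted as 2·N³ floating-point operations (N³ multiplies plus N³ adds). The page does not state this, so both the workload and the operation count are assumptions; the function name `matmul_gflops` is hypothetical.

```python
from typing import Optional


def matmul_gflops(n: int, avg_time_s: float) -> Optional[float]:
    """Estimate GFLOPS for one n x n matrix multiply.

    Assumption (not confirmed by the dataset): the timed kernel performs
    2 * n**3 floating-point operations.
    Returns None when the recorded time is zero or negative, because a
    time that rounds to 0.000000 s cannot yield a meaningful rate.
    """
    if avg_time_s <= 0:
        return None
    flops = 2.0 * n ** 3
    return flops / avg_time_s / 1e9


if __name__ == "__main__":
    # Values taken from the N = 2000 rows of the table.
    for label, seconds in [("c naive", 9.735280), ("pytorch aten", 0.115306)]:
        print(label, matmul_gflops(2000, seconds))
```

Under the same assumption, the N = 2000 pytorch row works out to roughly 138.8 GFLOPS. Rows whose time rounds to 0.000000 s, such as the cpp tiled kernel at N = 10, return `None` rather than dividing by zero.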

Glossary

AVX2
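Advanced Vector Extensions 2. A SIMD extension to the x86 instruction set that operates on 256-bit vector registers (eight 32-bit floats or four 64-bit doubles per register), available on Intel processors since Haswell and on AMD processors since Excavator.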
Bump Allocator
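A memory allocation strategy in which one large block is reserved up front and each subsequent allocation simply advances a pointer through it. There are no free lists to maintain and no per-allocation calls into the OS, which avoids the timing jitter those can introduce.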
GFLOPS
Giga floating-point operations per second, that is, billions (10^9) of floating-point operations per second. The primary metric for compute throughput.
L1 Cache
The fastest and smallest level of the CPU cache hierarchy (typically 32-64 KB per core), located directly on the CPU core.
OpenMP
Open Multi-Processing. An API for shared-memory parallel programming in C, C++, and Fortran, supported across multiple platforms.
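
The Bump Allocator entry above describes the strategy in words; a minimal sketch may make the pointer arithmetic concrete. It is written in Python over a `bytearray` purely for illustration and is not the allocator used by the benchmarked kernels. The class name `BumpAllocator`, its methods, and the default 8-byte alignment are all hypothetical.

```python
class BumpAllocator:
    """Illustrative bump allocator over one preallocated buffer."""

    def __init__(self, capacity: int) -> None:
        # The single large allocation made up front.
        self._buffer = bytearray(capacity)
        self._view = memoryview(self._buffer)
        # The "bump pointer": offset of the next free byte.
        self._offset = 0

    @property
    def used(self) -> int:
        """Number of bytes consumed so far, including alignment padding."""
        return self._offset

    def alloc(self, size: int, align: int = 8) -> memoryview:
        """Return a writable view of `size` bytes starting at an aligned offset."""
        if size < 0:
            raise ValueError("size must be non-negative")
        if align <= 0 or align & (align - 1):
            raise ValueError("align must be a positive power of two")
        # Round the current offset up to the next multiple of `align`.
        start = (self._offset + align - 1) & ~(align - 1)
        end = start + size
        if end > len(self._buffer):
            raise MemoryError("bump allocator exhausted")
        # Allocation is just moving the pointer forward.
        self._offset = end
        return self._view[start:end]

    def reset(self) -> None:
        """Release every allocation at once by rewinding the pointer."""
        self._offset = 0


if __name__ == "__main__":
    arena = BumpAllocator(1024)
    a = arena.alloc(3)    # occupies bytes 0..2
    b = arena.alloc(16)   # starts at offset 8 because of alignment
    a[0] = 255
    print(arena.used)
    arena.reset()
    print(arena.used)
```

Individual blocks are never freed in this sketch; `reset` rewinds the pointer so the whole buffer can be reused, which is the usual trade-off of the technique.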