mirror of https://github.com/Nonannet/copapy.git
Readme updated with benchmark result
parent
00c825b207
commit
d9f361a6d6
@@ -34,9 +34,14 @@ Currently in development:
- Support for Thumb instructions required by ARM*-M targets (for MCUs)
- Constant regrouping for further symbolic optimization of the computation graph
Despite the missing SIMD optimization, benchmark performance shows promising numbers. The following chart plots the results in comparison to NumPy 2.3.5:

For the benchmark (`tests/benchmark.py`), the time for 30000 iterations of calculating the term `sum((v1 + i) @ v2 for i in range(10))` was measured on a Ryzen 5 3400G. The vectors `v1` and `v2` both have a length of `v_size`, which was varied from 10 to 600 as shown in the chart. For the NumPy case, the `i in range(10)` loop was vectorized like this: `np.sum((v1 + i) @ v2)`, with `i` being an `NDArray` of shape `[10, 1]`. The number of scalar operations calculated is the same for both contenders. Copapy evidently profits from lower overhead, since it calls a single function from Python per iteration, whereas the NumPy variant requires three. Interestingly, the chart gives no indication that, for increasing `v_size`, NumPy's calling overhead is compensated by its faster SIMD instructions.
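The equivalence of the two NumPy forms described above can be checked with a short sketch. This is not the code from `tests/benchmark.py`; the function names `loop_term` and `vectorized_term`, the random inputs, and the chosen `v_size` are assumptions made for illustration only.

```python
import numpy as np


def loop_term(v1, v2):
    """Reference form: one dot product per value of i."""
    return sum((v1 + i) @ v2 for i in range(10))


def vectorized_term(v1, v2):
    """Broadcast form: i has shape (10, 1), so v1 + i has shape (10, v_size)."""
    i = np.arange(10).reshape(10, 1)
    # (10, v_size) @ (v_size,) yields 10 dot products, which np.sum adds up.
    return np.sum((v1 + i) @ v2)


v_size = 10  # hypothetical value; the benchmark varies it from 10 to 600
rng = np.random.default_rng(0)
v1 = rng.random(v_size)
v2 = rng.random(v_size)

# Both forms evaluate the same term, up to floating-point rounding.
assert np.isclose(loop_term(v1, v2), vectorized_term(v1, v2))
```

The broadcast form issues three NumPy calls per evaluation (the addition, the matrix product, and the sum), which matches the call count mentioned for the NumPy variant.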
Furthermore, for many applications Copapy will benefit from significantly reducing the actual number of operations compared to a NumPy implementation, by precomputing constant values known at compile time and by exploiting sparsity. Multiplying by zero (e.g. in a diagonal matrix) eliminates a whole branch in the computation graph. Operations without effect, like multiplications by 1 or additions of zero, are eliminated at compile time.
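The effect of these simplifications can be illustrated with a toy expression graph. This is a minimal sketch of the general technique and not Copapy's implementation or API; all names (`Var`, `Op`, `mul`, `add`, `matvec`, `count_ops`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    """A value that is only known at run time."""
    name: str


@dataclass(frozen=True)
class Op:
    """An operation that remains in the graph after simplification."""
    kind: str  # "add" or "mul"
    left: "Node"
    right: "Node"


Node = Union[float, Var, Op]


def is_const(n):
    return isinstance(n, (int, float))


def mul(a, b):
    if is_const(a) and is_const(b):
        return a * b  # constant folding
    for c, other in ((a, b), (b, a)):
        if is_const(c):
            if c == 0:
                return 0.0  # the whole branch behind `other` is dropped
            if c == 1:
                return other  # multiplication by 1 has no effect
    return Op("mul", a, b)


def add(a, b):
    if is_const(a) and is_const(b):
        return a + b  # constant folding
    for c, other in ((a, b), (b, a)):
        if is_const(c) and c == 0:
            return other  # addition of zero has no effect
    return Op("add", a, b)


def matvec(matrix, vec):
    """Build one simplified expression per matrix row."""
    rows = []
    for row in matrix:
        acc = 0.0
        for m, v in zip(row, vec):
            acc = add(acc, mul(m, v))
        rows.append(acc)
    return rows


def count_ops(n):
    if isinstance(n, Op):
        return 1 + count_ops(n.left) + count_ops(n.right)
    return 0


x = [Var("x0"), Var("x1"), Var("x2")]
diag = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 3.0]]
result = matvec(diag, x)
remaining = sum(count_ops(r) for r in result)
```

For this diagonal matrix only two multiplications remain in the graph: the zero entries remove their branches, and the row with the factor 1 reduces to the plain variable `x1`.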
## Install
To install Copapy, you can use pip. Precompiled wheels are available for Linux (x86_64, AArch64, ARMv7), Windows (x86_64) and macOS (x86_64, AArch64):