perf: copy optimization #799

Open
jodavies wants to merge 1 commit into form-dev:master from jodavies:memmove

@jodavies
Collaborator

Here is an optimisation experiment: replacing all NCOPY/WCOPY macros with memmove. (memcpy is not an option, because we can't be sure that the memory regions never overlap wherever the macros are used.) The replacement alone gives a negligible performance improvement (tentatively 1%?), which is hard to detect within the usual run-to-run variation.
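To illustrate the overlap concern, here is a minimal sketch in C. The type `WORD_T` and the functions `copy_forward` and `copy_memmove` are hypothetical stand-ins for this example; they are not the project's actual NCOPY/WCOPY definitions. The sketch assumes an element-by-element forward copy and compares it with `memmove`, which is defined for overlapping regions (unlike `memcpy`).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef long WORD_T; /* hypothetical stand-in for the project's word type */

/* Hypothetical element-wise forward copy, in the spirit of a copy macro.
   This is an assumption for illustration, not the real macro body. */
static void copy_forward(WORD_T *dst, const WORD_T *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    while (n-- > 0) *dst++ = *src++;
}

/* Same interface, delegating to memmove, which handles overlap
   as if the source were first copied to a temporary buffer. */
static void copy_memmove(WORD_T *dst, const WORD_T *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    memmove(dst, src, n * sizeof *dst);
}
```

When the destination lies below the source (shifting data towards lower addresses), both functions give the same result on overlapping buffers. When the destination lies above the source, the forward loop re-reads elements it has already overwritten, while `memmove` preserves the original source contents. Whether any call site depends on either behaviour is not stated in this thread.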

The follow-up commits improve some existing copies within the code by using the macros instead, and by moving some conditionals outside of the copies. I identified the expensive copies with a profiler running the Forcer benchmark.
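As a rough sketch of what moving a conditional outside of a copy can look like, consider the following C example. The functions and the `negate` flag are invented for illustration and do not correspond to any specific code touched by this PR.

```c
#include <assert.h>
#include <stddef.h>

/* Before: the loop-invariant flag is tested on every iteration. */
static void copy_naive(long *dst, const long *src, size_t n, int negate)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++) {
        if (negate) dst[i] = -src[i];
        else        dst[i] = src[i];
    }
}

/* After: the flag is tested once, and each loop body is branch-free. */
static void copy_hoisted(long *dst, const long *src, size_t n, int negate)
{
    assert(dst != NULL && src != NULL);
    if (negate) {
        for (size_t i = 0; i < n; i++) dst[i] = -src[i];
    } else {
        for (size_t i = 0; i < n; i++) dst[i] = src[i];
    }
}
```

Both versions produce identical output. Some compilers can perform this transformation (loop unswitching) themselves at higher optimisation levels, so any gain from doing it by hand has to be confirmed by measurement.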

On my system (Ryzen 7900X, Ubuntu 24.04, GCC 13.3.0, tform -w12), the results for the usual benchmarks are as follows:

| Benchmark | Speedup w.r.t. v5.0.0 |
|---|---|
| chromatic | 1.05 ± 0.01 |
| color | 1.01 ± 0.01 |
| fmft | 1.03 ± 0.01 |
| forcer | 1.07 ± 0.00 |
| forcer-exp | 1.08 ± 0.01 |
| mass-fact | 1.00 ± 0.05 |
| mbox1l | 1.01 ± 0.02 |
| minceex | 1.07 ± 0.02 |
| mincer | 1.00 ± 0.05 |
| sort-disk | 0.98 ± 0.02 |
| sort-large | 0.99 ± 0.01 |
| sort-small | 1.01 ± 0.01 |
| trace | 1.02 ± 0.01 |

@vermaseren
Collaborator

vermaseren commented Feb 15, 2026 via email

@coveralls

coveralls commented Feb 16, 2026

Coverage Status

coverage: 58.029% (-0.01%) from 58.043%
when pulling c5931f9 on jodavies:memmove
into c134010 on form-dev:master.

Hoist conditionals out of some data copying loops or simplify while loop
termination conditions. Use of memmove does not measurably affect performance;
leave a comment about this.
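
One way to read "simplify while loop termination conditions" is sketched below in C: a copy loop that checks two bounds per iteration is rewritten so that the bound is resolved once up front. The function names and the two-bound scenario are assumptions made for this example; the commit itself may simplify different conditions.

```c
#include <assert.h>
#include <stddef.h>

/* Before: two conditions are evaluated on every iteration. */
static size_t copy_bounded_naive(long *dst, size_t dst_cap,
                                 const long *src, size_t src_len)
{
    assert(dst != NULL && src != NULL);
    size_t i = 0;
    while (i < src_len && i < dst_cap) {
        dst[i] = src[i];
        i++;
    }
    return i; /* number of elements copied */
}

/* After: compute the element count once, then loop on a single
   pointer comparison. */
static size_t copy_bounded_simple(long *dst, size_t dst_cap,
                                  const long *src, size_t src_len)
{
    assert(dst != NULL && src != NULL);
    size_t n = src_len < dst_cap ? src_len : dst_cap;
    long *end = dst + n;
    while (dst < end) *dst++ = *src++;
    return n; /* number of elements copied */
}
```

Both functions copy the same elements and return the same count; the second does less work per iteration.
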
@jodavies
Collaborator Author

I ran the benchmarks with more samples, and I think the use of memmove indeed doesn't lead to a measurable performance difference. I cleaned up the commits to include only the obvious wins and to use the original macros.

This results in a 6-7% improvement for Forcer, 3% for mincer-exact, and 0-1% for everything else.

@jodavies changed the title from "WIP copy optimisation" to "perf: copy optimization" on Feb 26, 2026
@tueda
Collaborator

tueda commented Feb 26, 2026

Coveralls is completely down... See: https://status.coveralls.io/
