Code Generation: Constant creation with NOT(0) + shifts#16729

Open
DanielVF wants to merge 7 commits into argotorg:develop from DanielVF:moh-eulith-constants

@DanielVF
Contributor

@DanielVF DanielVF commented May 15, 2026

Description

Right or left aligned constants in 12 gas, 5 bytes, vs current 18 gas, 9 bytes.

// ZERO NOT - creates a stack with all one bits
// 0XFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
// PUSH1 {immediate} SHR - slides one bits over to create the desired constant
// 0X000000000000000000000000000000000000FFFFFFFFFFFFFFFFFFFFFFFFFFFF

PR #15935 proposed using NOT(0) followed by shifts to create constants. In most situations this is more bytecode-efficient than the current computed constants. It results in a 2.9% improvement in optimized (200 runs) bytecode size, as well as a slight average gas improvement. The via-ir optimization results show a similar improvement.
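The construction described above can be checked with plain 256-bit integer arithmetic. The following is a minimal Python model, not the compiler's code; the helper names `right_aligned_ones` and `left_aligned_ones` are hypothetical and only illustrate the `ZERO NOT PUSH1 n SHR` (or `SHL`) pattern.

```python
WORD_BITS = 256
ALL_ONES = (1 << WORD_BITS) - 1  # value of NOT(0) on a 256-bit EVM word


def right_aligned_ones(width: int) -> int:
    """Model of `ZERO NOT PUSH1 (256 - width) SHR`: the low `width` bits set."""
    assert 1 <= width <= WORD_BITS
    return ALL_ONES >> (WORD_BITS - width)


def left_aligned_ones(width: int) -> int:
    """Model of `ZERO NOT PUSH1 (256 - width) SHL`: the high `width` bits set."""
    assert 1 <= width <= WORD_BITS
    # Python integers are unbounded, so truncate to one EVM word after shifting.
    return (ALL_ONES << (WORD_BITS - width)) & ALL_ONES


# Two common masks: a 160-bit address mask and a 4-byte selector mask.
address_mask = right_aligned_ones(160)
selector_mask = left_aligned_ones(32)
print(hex(address_mask))
print(hex(selector_mask))
```

Because the shift amount is at most 255, it always fits in the single immediate byte of `PUSH1`, which is what keeps the sequence short regardless of the mask width.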

The underlying concepts are sound, but I wanted the code to reflect that underlying simplicity. This PR:

  • Rebases off the latest develop branch and updates all tests
  • Refactors the core logic for clarity and to make safety obvious. Variables have been renamed, loops are obviously bounded, and comments have been updated.

The behavior should be identical between both PRs. Both PRs result in identical benchmark gas usage and bytecode size. I also wrote a separate harness (not included in this PR) to test the algorithm for correctness against every possible contiguous run of 1s.
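The harness itself is not part of the PR, so the following is only a sketch of what such an exhaustive check could look like. It assumes an interior run of 1s is built as NOT(0), then SHR to trim the run to its length, then SHL to move it into position; the function names are hypothetical.

```python
WORD_BITS = 256
ALL_ONES = (1 << WORD_BITS) - 1


def reference_run(start: int, end: int) -> int:
    """Independent reference: bits start..end-1 set, via two subtractions."""
    return ((1 << end) - 1) ^ ((1 << start) - 1)


def build_with_shifts(start: int, end: int) -> int:
    """Candidate routine: NOT(0), SHR to the run length, SHL into place."""
    length = end - start
    value = ALL_ONES >> (WORD_BITS - length)  # SHR
    return (value << start) & ALL_ONES        # SHL, truncated to 256 bits


def check_all_runs() -> int:
    """Compare candidate and reference for every contiguous run of 1s."""
    checked = 0
    for start in range(WORD_BITS):
        for end in range(start + 1, WORD_BITS + 1):
            assert build_with_shifts(start, end) == reference_run(start, end)
            checked += 1
    return checked


total = check_all_runs()
print(total)
```

The search space is small (one case per start and end position), so a full enumeration is cheap and leaves no untested run.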

Why we should do this:

There are a lot of ways to optimize constant creation or masking operations. I think this is the one that should land immediately:

  • It's the simplest of all the approaches. It only modifies constant creation: roughly 25 lines in an if statement, duplicated because constants are optimized in two places. All effects are local.
  • Everything the yul constant creation does is immediately run through a tiny VM, which verifies that each constant replacement actually creates the correct value.
  • This optimization slightly stacks with future optimizations. Because this handles arbitrary runs of 1s, this PR would still provide benefits in certain cases even if the more efficient memory constant or double shift masking were implemented. Effort here will not be thrown away.
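The verification idea in the second bullet can be sketched as a tiny stack interpreter that replays a candidate routine and compares the result with the target constant. This is an illustrative model under assumed names (`execute`, `verify_replacement`, and the tuple-based instruction encoding are hypothetical), not the verifier that ships in the compiler.

```python
WORD_BITS = 256
MASK = (1 << WORD_BITS) - 1


def execute(program):
    """Replay a routine on a hypothetical mini instruction set."""
    stack = []
    for op, *args in program:
        if op == "PUSH":
            stack.append(args[0] & MASK)
        elif op == "NOT":
            stack.append(~stack.pop() & MASK)
        elif op == "SHL":
            # As in the EVM, the shift amount is on top of the stack.
            shift, value = stack.pop(), stack.pop()
            stack.append((value << shift) & MASK if shift < WORD_BITS else 0)
        elif op == "SHR":
            shift, value = stack.pop(), stack.pop()
            stack.append(value >> shift if shift < WORD_BITS else 0)
        else:
            raise ValueError(f"unsupported opcode: {op}")
    if len(stack) != 1:
        raise ValueError("routine must leave exactly one value on the stack")
    return stack[0]


def verify_replacement(target, program):
    """Accept a replacement only if it reproduces the target exactly."""
    return execute(program) == target


target = (1 << 160) - 1
good = [("PUSH", 0), ("NOT",), ("PUSH", 96), ("SHR",)]
off_by_one = [("PUSH", 0), ("NOT",), ("PUSH", 95), ("SHR",)]
ok = verify_replacement(target, good)
rejected = not verify_replacement(target, off_by_one)
```

A check of this shape means a bug in the routine generator shows up as a rejected replacement rather than as a silently wrong constant.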

Checklist

AI Disclosure

  • No AI tools were used
  • AI tools were used (details below)

Usage:

  • Codex 5.5 was used for a first discussion of clarity approaches.
  • The yul code was copied from my own human-modified constant optimizer code.

@DanielVF
Contributor Author

I meant to have all commits show as their original commit author. They do, but only if you click on them. Full credit for this optimization approach goes to @moh-eulith.

@DanielVF
Contributor Author

I'll go through the failing tests and update them as needed.

@DanielVF DanielVF marked this pull request as draft May 16, 2026 10:41
@DanielVF DanielVF marked this pull request as ready for review May 18, 2026 12:51
@cameel cameel requested a review from blishko May 18, 2026 13:38
@DanielVF DanielVF changed the title Better constant creation with NOT(0) + shifts Code Generation: Constant creation with NOT(0) + shifts May 18, 2026
@DanielVF
Contributor Author

Stats from the output of c_ext_benchmarks.

Bytecode size

| Contract | ir-no-optimize | ir-optimize-evm+yul | ir-optimize-evm-only | legacy-no-optimize | legacy-optimize-evm+yul | legacy-optimize-evm-only |
|---|---|---|---|---|---|---|
| brink | -0.05% | -2.13% | -1.67% | +0.00% | -2.73% | -2.71% |
| colony | +0.00% | -2.39% | -2.05% | +0.00% | -2.39% | -2.05% |
| elementfi | -0.15% | -1.77% | -0.99% | +0.00% | -1.80% | -1.53% |
| ens | -0.17% | -2.48% | -1.44% | +0.00% | -2.13% | -1.86% |
| euler | n/a | -3.58% | n/a | +0.00% | -2.81% | -2.38% |
| gnosis | n/a | n/a | n/a | +0.00% | -2.58% | -2.09% |
| gp2 | -0.11% | -2.10% | -1.09% | +0.00% | -2.02% | -1.61% |
| pool-together | -0.16% | -2.74% | -1.40% | +0.00% | -2.74% | -2.31% |
| uniswap | -0.38% | -3.35% | -1.76% | +0.00% | -3.22% | -2.88% |
| yield_liquidator | -0.36% | -3.83% | -1.98% | +0.00% | -3.65% | -3.10% |
| zeppelin | n/a | -2.73% | -1.59% | n/a | -2.61% | n/a |

Gas

| Contract | ir-no-optimize | ir-optimize-evm+yul | ir-optimize-evm-only | legacy-no-optimize | legacy-optimize-evm+yul | legacy-optimize-evm-only |
|---|---|---|---|---|---|---|
| ens | n/a | -0.07% | -0.19% | n/a | -0.08% | -0.08% |
| euler | n/a | -0.49% | n/a | +0.00% | -0.68% | -0.64% |
| uniswap | -0.28% | -3.03% | -1.54% | +0.00% | -2.87% | -2.61% |
| yield_liquidator | -0.03% | -0.07% | -0.24% | +0.00% | -0.13% | -0.13% |
| zeppelin | n/a | -1.09% | n/a | n/a | -1.36% | n/a |

Contributor

@blishko blishko left a comment

I have glanced over the code and I am generally in favour of this change.

It seems this could be further split into two separate changes and we could evaluate them separately.

One thing I don't like is the duplication of the code in the evmasm optimizer and in the Yul optimizer.
I think we would ideally implement this change only at the evmasm level.
(I think we could actually rip out the whole Yul constant optimizer, see #16738.)

Comment on lines +309 to +317
// pure negation can sometimes produce bad results
// example: 0xff00000000000000000000000000000000000000000000000000000000000000
// 0xff at the most significant byte of u256
// without the extra condition: not(sub(shl(0xf8, 0x01), 0x01))
// the extra condition turns that into: shl(0xf8, 0xff)
if (
	numberEncodingSize(~_value) < numberEncodingSize(_value) &&
	(onesEnd < 256 || onesEnd - onesStart > 16)
)
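The example in the comment above can be worked through numerically. The sketch below assumes that `numberEncodingSize` returns the number of bytes needed to encode a value, and that `onesStart` and `onesEnd` are the index of the lowest set bit and one past the highest set bit; both readings are inferred from the snippet, not confirmed by it, and the Python helper names are hypothetical.

```python
WORD_BITS = 256
MASK = (1 << WORD_BITS) - 1


def number_encoding_size(value: int) -> int:
    """Assumed meaning: bytes needed to encode `value` (0 needs 0 bytes)."""
    return (value.bit_length() + 7) // 8


def ones_bounds(value: int):
    """Assumed meaning: (lowest set bit, one past the highest set bit)."""
    ones_start = (value & -value).bit_length() - 1
    ones_end = value.bit_length()
    return ones_start, ones_end


value = 0xFF << 248          # 0xff in the most significant byte
negated = ~value & MASK
ones_start, ones_end = ones_bounds(value)

# Both routines from the comment produce the same constant.
via_negation = ~((1 << 0xF8) - 1) & MASK  # not(sub(shl(0xf8, 0x01), 0x01))
via_shift = (0xFF << 0xF8) & MASK         # shl(0xf8, 0xff)

# The first half of the condition alone would pick the negation path...
negation_is_shorter = number_encoding_size(negated) < number_encoding_size(value)
# ...but the extra condition rejects it for a short run touching bit 255.
guard = ones_end < 256 or ones_end - ones_start > 16
use_negation = negation_is_shorter and guard
```

For this value the negated form needs 31 bytes against 32, so the size comparison passes, while the guard fails because the run is only 8 bits long and ends at bit 256.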
Contributor

This is interesting!
We could try to separate this change and see what effect we get from this small change.


this depends on the computed mask variables (onesEnd, onesStart), so separating it would require a chunk of code duplicated in both PRs.

Contributor

What I am suggesting is to keep the computation of onesEnd and onesStart, but skip the computation of newRoutine. Do you expect that would still provide some benefit?


Yes, it would provide a tiny benefit for rare cases like the given example (easy to see in a test case, but it hardly moves the needle for real contracts). IMHO, not worth a separate PR.

P.S. Did you notice yul optimizer doesn't need this? It's because it's structurally better than the libevmasm version (yul compares different computations; libevmasm compares a single computation with other choices, like DATACOPY, so the single computation has to have extra conditionals).

Contributor

> P.S. Did you notice yul optimizer doesn't need this? It's because it's structurally better than the libevmasm version (yul compares different computations; libevmasm compares a single computation with other choices, like DATACOPY, so the single computation has to have extra conditionals).

I definitely see some differences. If yul version does something better, the evmasm version could be improved to do the same thing, no?

In any case, no need to separate this into a new PR.


> Removing the yul optimizer on top of the not(0) constant optimizer makes the yul code even more optimized

That's weird. The tests included in this repo's CI tell a different story. When I added the not(0) optimizations to the yul path, there were improvements. Compare
#15935 (comment)
and
#15935 (comment)

For example, "colony" went from -1.91 to -2.33, etc.

Contributor Author

This uses the not(0) constant optimizer as the baseline, so it compares the not(0) constant optimizer PR alone (0 changes by definition) against having both the not(0) constant optimizer and the yul constant optimizer removed (the splat version).

The not(0) constant optimizer is definitely a win in either case.

Contributor Author

@DanielVF DanielVF May 22, 2026

Okay, I get it now, and I also understand why, on the performance optimization side, I've been seeing changes that should only speed up yul speed up the regular compiler too.

appendYulUtilityFunctions() is called by appendMissingFunctions() and places yul code into contracts. Doing this runs the yul side, including the yul optimizers, even on non-IR contracts.

Contributor Author

@moh-eulith Here's the chart, with both against a baseline of the current development branch. This is probably what you were expecting.

[Image: chart of both variants measured against the current development branch]


I assume "splat" is without the yul optimizer. I think the code this is benchmarking against matters. I just tried this on my own contracts: removing yul-constant-optimizer increased the total contract size from 61,593 to 61,903 (I compile with --via-ir --optimize --optimize-runs 2, so presumably similar to your via-ir-low-runs).

@blishko's PR shows the same thing (#16738 (comment)), where ir-optimize-evm+yul has some regressions and some improvements.


3 participants