Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions Changelog.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@ Language Features:

Compiler Features:
* General: Speed up SHA-256 hashing (`picosha2`).
* General: Speed up compilation times.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, should probably mention that this particular improvement is due to optimizations in the CSE and dataflow analysis. In any case, let's wait until @cameel takes a look, he'll probably suggest a final version of the changelog entry phrasing.


Bugfixes:
* NatSpec: Disallow `@return` tag in event documentation.
Expand Down
8 changes: 7 additions & 1 deletion libyul/optimiser/CommonSubexpressionEliminator.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,8 @@
#include <libyul/Dialect.h>
#include <libyul/Utilities.h>

#include <algorithm>

using namespace solidity;
using namespace solidity::yul;
using namespace solidity::util;
Expand Down Expand Up @@ -129,6 +131,10 @@ void CommonSubexpressionEliminator::visit(Expression& _e)
void CommonSubexpressionEliminator::assignValue(YulName _variable, Expression const* _value)
{
if (_value)
m_replacementCandidates[*_value].insert(_variable);
{
auto& candidates = m_replacementCandidates[*_value];
if (std::find(candidates.begin(), candidates.end(), _variable) == candidates.end())
candidates.emplace_back(_variable);
Comment on lines +135 to +137
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will affect the order in which candidates are iterated in, so could/will affect the bytecode in the sense that it may now pick a different (although still valid) (sub)expression to be substituted. There are no guarantees on our side as to what the bytecode will actually look like as long as it's semantically correct, so I'm pretty sure this isn't a problem. @cameel can you double check please - this is the only change where I'm slightly uncertain?

Copy link
Copy Markdown
Contributor Author

@DanielVF DanielVF May 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

candidates here is an order preserving vector, so the iteration and subsequent emplace_back is deterministic.

m_replacementCandidates is the std:unordered_set with a key type of expression and value type of vector. We just use the unordered_set to find the vector of previously found candidates. Then, this code is just appending to that ordered vector when a new candidate is found. So this particular code should be safe.

Even if the candidates vector were unordered, this check is just checking for presence, which is deterministic regardless of order.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're still iterating over the values in the vector (previously set), and the two have different iteration orders (set sort vs insertion), which makes it possible for different (but equivalent) expressions/values to be used as replacements. I'm not saying this is wrong or undesired behaviour, just that it could result in different bytecode between the two approaches, which is why I asked @cameel to double check. From my point of view, this is perfectly fine (but I have been wrong before).

Copy link
Copy Markdown
Contributor Author

@DanielVF DanielVF May 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In pseudo javascript:

m_replacementCandidates = {}

function insert(value, expression){
  candidates = m_replacementCandidates[value] // plus magic to insert array if not present
  if(!candidates.find(expression)){
     candidates.push(expression)
  }
}

Copy link
Copy Markdown
Contributor Author

@DanielVF DanielVF May 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah - now I understand! The set might have been sorted by something different than its insertion order, even if both are deterministic.

Copy link
Copy Markdown
Contributor Author

@DanielVF DanielVF May 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the important bit is not backwards compatible bytecode, but consistent bytecode output for this compiler version. In that case, a different sorting order would be fine, as long as it's deterministic. And this is definitely that.

That said, no tests have needed to change as a result of this, and my local benchmarking on a set of real world contracts has identical bytecode out with this change, as without it.

}
DataFlowAnalyzer::assignValue(_variable, _value);
}
7 changes: 4 additions & 3 deletions libyul/optimiser/CommonSubexpressionEliminator.h
Original file line number Diff line number Diff line change
Expand Up @@ -27,7 +27,8 @@
#include <libyul/optimiser/SyntacticalEquality.h>
#include <libyul/optimiser/BlockHasher.h>

#include <set>
#include <unordered_set>
#include <vector>

namespace solidity::yul
{
Expand Down Expand Up @@ -62,10 +63,10 @@ class CommonSubexpressionEliminator: public DataFlowAnalyzer

void assignValue(YulName _variable, Expression const* _value) override;
private:
std::set<YulName> m_returnVariables;
std::unordered_set<YulName> m_returnVariables;
std::unordered_map<
std::reference_wrapper<Expression const>,
std::set<YulName>,
std::vector<YulName>,
ExpressionHash,
SyntacticallyEqualExpression
> m_replacementCandidates;
Expand Down
19 changes: 13 additions & 6 deletions libyul/optimiser/DataFlowAnalyzer.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -107,7 +107,7 @@ void DataFlowAnalyzer::operator()(VariableDeclaration& _varDecl)
std::set<YulName> names;
for (auto const& var: _varDecl.variables)
names.emplace(var.name);
m_variableScopes.back().variables += names;
m_variableScopes.back().variables.insert(names.begin(), names.end());

if (_varDecl.value)
{
Expand Down Expand Up @@ -346,17 +346,24 @@ void DataFlowAnalyzer::clearValues(std::set<YulName> const& _variablesToClear)
_variablesToClear.count(_item.second);
});

// Also clear variables that reference variables to be cleared.
std::set<YulName> referencingVariablesToClear;
// Also collect variables that directly reference variables to be cleared.
// Keep the original variables separate from the referencing ones; duplicate erases are harmless.
std::vector<YulName> referencingVariablesToClear;
std::vector const sortedVariablesToClear(_variablesToClear.begin(), _variablesToClear.end());
for (auto const& [referencingVariable, referencedVariables]: m_state.sortedReferences)
// instead of checking each variable in `referencedVariables`, we check if there is any intersection making use of the
// sortedness of the vectors, which can increase performance by up to 50% in pathological cases
if (hasNonemptyIntersectionSorted(referencedVariables, sortedVariablesToClear))
referencingVariablesToClear.emplace(referencingVariable);
referencingVariablesToClear.emplace_back(referencingVariable);

// Clear the value and update the reference relation.
for (auto const& name: _variablesToClear + referencingVariablesToClear)
// Clear values and reference information for the requested variables and their direct dependents.
// Duplicate erases are safe and are faster than a separate deduplication pass
for (auto const& name: _variablesToClear)
{
m_state.value.erase(name);
m_state.sortedReferences.erase(name);
}
for (auto const& name: referencingVariablesToClear)
{
m_state.value.erase(name);
m_state.sortedReferences.erase(name);
Expand Down
7 changes: 4 additions & 3 deletions libyul/optimiser/DataFlowAnalyzer.h
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,7 @@

#include <map>
#include <set>
#include <unordered_set>

namespace solidity::yul
{
Expand Down Expand Up @@ -105,7 +106,7 @@ class DataFlowAnalyzer: public ASTModifier
/// @returns the current value of the given variable, if known - always movable.
AssignedValue const* variableValue(YulName _variable) const { return util::valueOrNullptr(m_state.value, _variable); }
std::vector<YulName> const* sortedReferences(YulName _variable) const { return util::valueOrNullptr(m_state.sortedReferences, _variable); }
std::map<YulName, AssignedValue> const& allValues() const { return m_state.value; }
std::unordered_map<YulName, AssignedValue> const& allValues() const { return m_state.value; }
std::optional<YulName> storageValue(YulName _key) const;
std::optional<YulName> memoryValue(YulName _key) const;
std::optional<YulName> keccakValue(YulName _start, YulName _length) const;
Expand Down Expand Up @@ -178,7 +179,7 @@ class DataFlowAnalyzer: public ASTModifier
struct State
{
/// Current values of variables, always movable.
std::map<YulName, AssignedValue> value;
std::unordered_map<YulName, AssignedValue> value;
/// m_references[a].contains(b) <=> the current expression assigned to a references b
/// The mapped vectors _must always_ be sorted
std::unordered_map<YulName, std::vector<YulName>> sortedReferences;
Expand Down Expand Up @@ -213,7 +214,7 @@ class DataFlowAnalyzer: public ASTModifier
struct Scope
{
explicit Scope(bool _isFunction): isFunction(_isFunction) {}
std::set<YulName> variables;
std::unordered_set<YulName> variables;
bool isFunction;
};
/// Special expression whose address will be used in m_value.
Expand Down
Loading