fix(waddrmgr): fix data race between Manager lock/unlock and ScopedKeyManager operations#1183
fix(waddrmgr): fix data race between Manager lock/unlock and ScopedKeyManager operations#1183Aharonee wants to merge 2 commits into
Conversation
e829d12 to
0cbe47a
Compare
There was a problem hiding this comment.
Pull request overview
Fixes a data race in waddrmgr by ensuring Manager.Lock()/Unlock() no longer mutate ScopedKeyManager internals without holding ScopedKeyManager’s mutex, and by separating sync-state locking to avoid lock-ordering problems during concurrent operations (e.g., address derivation vs wallet auto-lock).
Changes:
- Add
ScopedKeyManager-ownedLock/Unlock/Close/ConvertToWatchingOnlyflows and delegate fromManagerto avoid cross-mutex field access. - Introduce a dedicated
syncStateMtxand route synced/start block access through it. - Add race-regression tests covering both direct manager lock/unlock and the full wallet stack.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
wallet/wallet_test.go |
Adds a wallet-level concurrent lock/unlock vs NewAddress race regression test. |
waddrmgr/sync.go |
Moves SyncedTo state access to syncStateMtx instead of m.mtx. |
waddrmgr/scoped_manager.go |
Introduces KeyManagerRoot interface and encapsulates scoped-manager lock/unlock/cleanup logic behind s.mtx. |
waddrmgr/manager_test.go |
Adds a direct Manager.Lock()/Unlock() vs NextExternalAddresses() race regression test. |
waddrmgr/manager.go |
Adds syncStateMtx, delegates lock/unlock to scoped managers, and adds root accessors + start block management. |
waddrmgr/address.go |
Switches root manager field access to KeyManagerRoot methods for keys/params. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
| CryptoKeyPub() EncryptorDecryptor | ||
| CryptoKeyPriv() EncryptorDecryptor | ||
| CryptoKeyScript() EncryptorDecryptor | ||
| StartBlockHeight() int32 |
There was a problem hiding this comment.
StartBlockHeight is included in KeyManagerRoot, but there are no call sites in the repo using it (all start block updates go through SetStartBlock). If it’s not needed, consider removing it to keep the interface minimal and reduce future refactor burden.
| StartBlockHeight() int32 |
| // SetStartBlock updates the start block if the given block is earlier than the | ||
| // current start block. It persists the change to the database and updates | ||
| // the in-memory state atomically. | ||
| func (m *Manager) SetStartBlock(ns walletdb.ReadWriteBucket, | ||
| bs *BlockStamp) error { | ||
|
|
||
| m.syncStateMtx.Lock() | ||
| defer m.syncStateMtx.Unlock() | ||
|
|
||
| if bs.Height < m.syncState.startBlock.Height { | ||
| if err := putStartBlock(ns, bs); err != nil { | ||
| return err | ||
| } | ||
| m.syncState.startBlock = *bs | ||
| } | ||
| return nil |
There was a problem hiding this comment.
SetStartBlock accepts a *BlockStamp but dereferences it unconditionally (bs.Height) and will panic if a caller passes nil. Since this is a new exported method (and also part of the KeyManagerRoot interface), it should defensively handle nil (either as a no-op or by returning a typed error).
There was a problem hiding this comment.
Callers guard with bs != nil before calling SetStartBlock, same as the original code did before the height comparison.
| // CryptoKeyPub returns the public crypto key used to encrypt/decrypt public | ||
| // extended keys and addresses. | ||
| func (m *Manager) CryptoKeyPub() EncryptorDecryptor { | ||
| return m.cryptoKeyPub | ||
| } | ||
|
|
||
| // CryptoKeyPriv returns the private crypto key used to encrypt/decrypt private | ||
| // key material. The underlying key is zeroed when the manager is locked. | ||
| func (m *Manager) CryptoKeyPriv() EncryptorDecryptor { | ||
| return m.cryptoKeyPriv | ||
| } | ||
|
|
||
| // CryptoKeyScript returns the script crypto key used to encrypt/decrypt script | ||
| // data. The underlying key is zeroed when the manager is locked. | ||
| func (m *Manager) CryptoKeyScript() EncryptorDecryptor { |
There was a problem hiding this comment.
The new exported Manager methods CryptoKeyPub/CryptoKeyPriv/CryptoKeyScript expose the wallet’s in-memory encryption keys via the EncryptorDecryptor interface (including Bytes/CopyBytes). This significantly expands the public API surface and allows external callers to read/modify secret key material, which is a security risk and breaks encapsulation.
Consider making these methods (and KeyManagerRoot) unexported/package-private, or returning a narrower internal-only interface that does not expose raw key bytes or mutation operations.
| // CryptoKeyPub returns the public crypto key used to encrypt/decrypt public | |
| // extended keys and addresses. | |
| func (m *Manager) CryptoKeyPub() EncryptorDecryptor { | |
| return m.cryptoKeyPub | |
| } | |
| // CryptoKeyPriv returns the private crypto key used to encrypt/decrypt private | |
| // key material. The underlying key is zeroed when the manager is locked. | |
| func (m *Manager) CryptoKeyPriv() EncryptorDecryptor { | |
| return m.cryptoKeyPriv | |
| } | |
| // CryptoKeyScript returns the script crypto key used to encrypt/decrypt script | |
| // data. The underlying key is zeroed when the manager is locked. | |
| func (m *Manager) CryptoKeyScript() EncryptorDecryptor { | |
| // cryptoKeyPubKey returns the public crypto key used to encrypt/decrypt | |
| // public extended keys and addresses. This method is unexported to avoid | |
| // exposing the in-memory encryption key outside the package. | |
| func (m *Manager) cryptoKeyPubKey() EncryptorDecryptor { | |
| return m.cryptoKeyPub | |
| } | |
| // cryptoKeyPrivKey returns the private crypto key used to encrypt/decrypt | |
| // private key material. The underlying key is zeroed when the manager is | |
| // locked. This method is unexported to avoid exposing the in-memory | |
| // encryption key outside the package. | |
| func (m *Manager) cryptoKeyPrivKey() EncryptorDecryptor { | |
| return m.cryptoKeyPriv | |
| } | |
| // cryptoKeyScriptKey returns the script crypto key used to encrypt/decrypt | |
| // script data. The underlying key is zeroed when the manager is locked. This | |
| // method is unexported to avoid exposing the in-memory encryption key | |
| // outside the package. | |
| func (m *Manager) cryptoKeyScriptKey() EncryptorDecryptor { |
There was a problem hiding this comment.
Before this PR, ScopedKeyManager accessed s.rootManager.cryptoKeyPriv directly (22 times across scoped_manager.go and address.go). The method call returns the same reference. The EncryptorDecryptor interface was already public. No new security surface is introduced.
| // KeyManagerRoot is the interface through which ScopedKeyManager accesses | ||
| // the root Manager. It exposes only what is needed: lock state queries, | ||
| // crypto keys for encryption/decryption, chain parameters, and sync state | ||
| // management. This prevents ScopedKeyManager from directly accessing | ||
| // Manager's mutex or internal fields. | ||
| type KeyManagerRoot interface { | ||
| IsLocked() bool | ||
| WatchOnly() bool | ||
| ChainParams() *chaincfg.Params | ||
| CryptoKeyPub() EncryptorDecryptor | ||
| CryptoKeyPriv() EncryptorDecryptor | ||
| CryptoKeyScript() EncryptorDecryptor | ||
| StartBlockHeight() int32 | ||
| SetStartBlock(ns walletdb.ReadWriteBucket, bs *BlockStamp) error | ||
| } |
There was a problem hiding this comment.
KeyManagerRoot is exported and its methods are exported, which makes this new interface part of the package’s public API. Since it’s intended purely as an internal dependency boundary between Manager and ScopedKeyManager, consider making the interface unexported (and the methods unexported) to avoid committing to it as a stable external API.
Manager.lock() and Manager.Unlock() directly accessed ScopedKeyManager fields (acctInfo, addrs, deriveOnUnlock) while holding only m.mtx. ScopedKeyManager methods accessed those same fields while holding only s.mtx. Since these are different mutexes, there was no mutual exclusion on the shared fields. The fix moves responsibility for ScopedKeyManager's field lifecycle into ScopedKeyManager itself via Lock(), Unlock(), Close(), and ConvertToWatchingOnly() methods. Each acquires s.mtx internally, ensuring all access to the shared fields is consistently protected by s.mtx. Manager delegates to these methods instead of manipulating fields directly. A KeyManagerRoot interface replaces the *Manager back-reference in ScopedKeyManager, preventing it from directly accessing Manager's mutex or internal fields. syncState is moved to a dedicated syncStateMtx, and FetchScopedKeyManager is made lock-free (the scopedManagers map is stable after wallet initialization), removing all s.mtx -> m.mtx lock ordering violations.
0cbe47a to
94ca3cb
Compare
Fix all 41 golangci-lint issues: add blank lines for wsl_v5, use plain error assignment for noinlineerr, break long lines for lll, wrap external/interface errors for wrapcheck, add //nolint:ireturn for interface-returning accessors, use integer range syntax, add t.Parallel(), and move unexported method after exported ones for funcorder.
Summary
Fix a data race between
Manager.Lock()/Unlock()and concurrentScopedKeyManageroperations such as address derivation.The Race
Manager.lock()andManager.Unlock()directly manipulateScopedKeyManagerfields (acctInfo,addrs,deriveOnUnlock) while holding onlym.mtx.ScopedKeyManagermethods access these same fields while holding onlys.mtx. Since these are different mutexes, there is no mutual exclusion on the shared fields.In production, this triggers when the
walletLockergoroutine fires the unlock timeout (callingManager.Lock()) while address derivation (NextExternalAddresses) is in progress on another goroutine.The Fix
The root cause is that
Managerwas directly manipulatingScopedKeyManager's internal state instead of lettingScopedKeyManagermanage its own fields. The fix moves responsibility forScopedKeyManager's field lifecycle intoScopedKeyManageritself:ScopedKeyManagergetsLock(),Unlock(),Close(), andConvertToWatchingOnly()methods that encapsulate all field mutations behinds.mtx.Managerdelegates to these methods instead of reaching into scoped manager fields. This ensures all access to the shared fields is consistently protected bys.mtx.KeyManagerRootinterface:ScopedKeyManager.rootManageris now an interface instead of*Manager, preventing it from directly accessingManager's mutex or internal fields. This removes the code paths whereScopedKeyManagerpreviously acquiredrootManager.mtxwhile holdings.mtx(inimportPublicKeyandimportScriptAddress).syncStatemoved to dedicatedsyncStateMtx: the only reasonScopedKeyManagerpreviously acquiredrootManager.mtxwas to read/writesyncState.startBlock. A separate mutex for this field, exposed viaStartBlockHeight()/SetStartBlock()methods on the interface, removes the ordering conflict.Reproducing the Race
Two tests are included:
TestLockUnlockRace(waddrmgr/manager_test.go): directly callsManager.Lock()/Unlock()concurrently withNextExternalAddresses().TestWalletLockerAddressRace(wallet/wallet_test.go): full wallet stack --Wallet.Lock()/Unlock()through thewalletLockergoroutine concurrently withWallet.NewAddress().To verify the race exists on unpatched master:
Both report multiple
WARNING: DATA RACEon master. After the fix, both pass cleanly with-race.