RocksDB performance problems
PAN-3245

Description

We have been experiencing performance issues since Pantheon 1.1.3 due to RocksDB read performance.

Our context is the following:

  • We use Eventeum as the event listener platform for Besu. We are core committers of the project.

  • Eventeum internally stores a kind of index recording the latest block read, so that it can retrieve all the events happening on the chain.

  • When Eventeum stops for a while, whether for planned maintenance, a configuration change, or a read problem at the node level (the node can still get new blocks from the chain but cannot serve any query to Eventeum), Eventeum falls out of sync with the head block of the chain.

  • When Eventeum starts, it creates a log filter for each event filter via eth_newFilter, syncs the pending information via eth_getFilterLogs, and then uses eth_getFilterChanges to get updates once it is in sync.
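
For reference, the startup sequence described above can be sketched as three JSON-RPC request bodies. This is a minimal illustration only, not Eventeum's actual code: the helper names, the placeholder contract address, and the filter id `0x1` are hypothetical, and only the three method names come from the report.

```python
import json


def rpc_request(method, params, request_id=1):
    """Build a JSON-RPC 2.0 request body as a plain dict."""
    return {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}


def new_filter_request(from_block, address, topics):
    """eth_newFilter: register a log filter starting at the last block read."""
    filter_obj = {
        "fromBlock": hex(from_block),  # block numbers are hex-encoded quantities
        "toBlock": "latest",
        "address": address,
        "topics": topics,
    }
    return rpc_request("eth_newFilter", [filter_obj])


def filter_logs_request(filter_id):
    """eth_getFilterLogs: fetch every log matching the filter (the catch-up step)."""
    return rpc_request("eth_getFilterLogs", [filter_id])


def filter_changes_request(filter_id):
    """eth_getFilterChanges: poll for logs that arrived since the last poll."""
    return rpc_request("eth_getFilterChanges", [filter_id])


if __name__ == "__main__":
    # Hypothetical values: a zero address and a filter id a node might return.
    placeholder_address = "0x" + "00" * 20
    print(json.dumps(new_filter_request(100000, placeholder_address, []), indent=2))
    print(json.dumps(filter_logs_request("0x1")))
    print(json.dumps(filter_changes_request("0x1")))
```

The catch-up call (eth_getFilterLogs) is the one whose cost grows with the number of unsynced blocks, since the filter spans everything from the last block read to the head.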

We are seeing that when, for whatever reason, the number of unsynced blocks is higher than 10,000, performance drops. With 100k blocks, Besu hangs at the query level, locking RocksDB, with traces like "Thread Thread[vert.x-worker-thread-18,5,main] has been blocked for 76419 ms, time limit is 60000 ms"

Based on this scenario, we have run some performance checks comparing Besu with GoQuorum:

  • 10k blocks: Besu = 5 secs, Quorum = 0.9 secs

  • 50k blocks: Besu = 10 secs, Quorum = 4 secs

  • 100k blocks: Besu = 20 secs, Quorum = 10 secs

  • More than 100k blocks: Besu's RocksDB threads block; Quorum takes some time but answers

It's strange, based on a comparison between LevelDB and RocksDB.

Given this scenario, we really need an urgent solution to boost performance.

Meanwhile, we have several things in mind:

  • Monitor RocksDB with the following metrics: "Latency for read from RocksDB.", "Latency of remove requests from RocksDB.", "Latency for write to RocksDB.", "Latency for commits to RocksDB.", raising alerts when it is underperforming. The problem is that we cannot see those metrics on the metrics endpoint in either 1.2.2 or 1.3.1. Can you review the following bug? Any suggestions for monitoring RocksDB?

  • At the Eventeum level, split the unsynced blocks into a number of chunks based on a kind of window. Any suggestion on the max window to use?
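
The windowing idea in the last point could look roughly like the sketch below. It is an illustration under stated assumptions, not Eventeum code: the function name `block_windows` is hypothetical, and the window size of 10,000 is only taken from the threshold at which the slowdown was observed above, not from any recommended limit.

```python
def block_windows(from_block, to_block, window_size):
    """Yield inclusive (start, end) block ranges covering from_block..to_block."""
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    start = from_block
    while start <= to_block:
        # Clamp the last window so it never runs past the chain head.
        end = min(start + window_size - 1, to_block)
        yield start, end
        start = end + 1


if __name__ == "__main__":
    # Hypothetical catch-up: last block read was 0, chain head is 25000.
    for start, end in block_windows(1, 25000, 10000):
        print(f"query logs for blocks {start}..{end}")
```

Each window would then be queried separately (for example with a bounded fromBlock/toBlock range), so that no single request has to scan the whole backlog.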

Kind regards

Environment

None

Status

Assignee

Danno Ferrin

Reporter

Fernando Paris

Labels

None

Scrum Team

Chupacabra

Refinement State

Not Started

Components

Sprint

None

Fix versions

Priority

P3