Conversation
```go
ctx2, cancel := context.WithTimeout(ctx, 3*time.Second)
defer cancel()

_, err := db.Conn.ExecContext(ctx2, `DELETE FROM ingestor_state WHERE id = 1`)
```
Both these operations should be done atomically in a transaction.
```go
lastBN, found := db.LoadLastBlockNumber(ctx)
if !found || lastBN == 0 {
	log.Printf(" [INIT] No previous state found, checking database for latest block...")
	err := db.Conn.QueryRowContext(ctx, `SELECT COALESCE(MAX(block_number),0) FROM blocks`).Scan(&lastBN)
```
If the point of the DB type is to mask the underlying storage layer, we shouldn't be accessing Conn directly. It also means that if the DB layer changes, e.g. a new table replaces blocks, this call site has to change too. Conn should be unexported, and any functionality should be provided through methods.
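A sketch of the suggested encapsulation, with a stub standing in for the real connection so it runs without a database (all names are illustrative; the real field would be an unexported `*sql.DB`):

```go
package main

import "fmt"

// conn is a stub for the real *sql.DB, kept unexported so callers cannot
// reach the raw connection.
type conn struct{ maxBlock int64 }

func (c *conn) queryMaxBlock() int64 { return c.maxBlock }

// DB exposes behavior through methods instead of exporting the connection.
type DB struct {
	conn *conn
}

// LatestBlockNumber hides the table and query details; if the schema changes,
// only this method changes, not every call site.
func (db *DB) LatestBlockNumber() (int64, bool) {
	n := db.conn.queryMaxBlock()
	return n, n > 0
}

func main() {
	db := &DB{conn: &conn{maxBlock: 42}}
	n, ok := db.LatestBlockNumber()
	fmt.Println(n, ok) // 42 true
}
```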
```go
// Validate required configuration

log.Printf("[INIT] Starting blockchain indexer with StarRocks database")
```
In all other parts of the code we use slog, which outputs structured JSON logs that are easier to parse. It also supports text logging and is configurable. The other services already use it and we have a util pkg for it, so it's better to start using it in this codebase.
All these manual [COMPONENT] prefixes can be replaced with log tags, which lets us filter on the tag in the logs instead of doing a text search.
```go
// Fetch execution block data
ei, err := beacon.FetchCombinedBlockData(httpc, infuraRPC, beaconBase, nextBN)
if err != nil || ei == nil {
	log.Printf("⏳ [BLOCK] Block %d not available yet: %v", nextBN, err)
```
Although I don't mind all these characters, I'm not sure anyone will be looking at these logs while things are running correctly; you'll only read them when there are errors, and logs with fewer characters are easier to parse.
```go
case <-backfillTicker.C:
	log.Printf("[BACKFILL] Starting backfill operations...")
	backfill.RunAll(ctx, db, httpc, createOptionsFromCLI(c), relays)
```
Why is a ticker needed for backfill? This operation should only happen once. It can also run in parallel with the main goroutine, so it should just be started before we enter this loop.
```go
	httputil "github.com/primev/mev-commit/indexer/pkg/http"
)

const optInABIJSON = `[
```
I gave this comment in the last PR as well. We can use the generated client in contracts-abi; all it needs is an Ethereum client, and it will handle all the data types.
```go
},
}
}

func FetchJSONWithRetry(ctx context.Context, httpc *http.Client, url string, out any, attempts int, baseDelay time.Duration) error {
```
If we want to provide these kinds of utility functions, we can't just assume they're doing the right thing; unit tests should be added for this pkg. Otherwise, we can use a library that provides this functionality.
```go
}
func (db *DB) Close() {
	db.Conn.Close()
```
This doesn't return an error?
Also, if it's the DB's job to close this, we should not be exposing the conn directly outside.
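A sketch of both points together: the connection is unexported and `Close` propagates the underlying error instead of swallowing it (`io.Closer` stands in for the real `*sql.DB`):

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// DB keeps the connection unexported; only the DB can close it.
type DB struct {
	conn io.Closer
}

// Close propagates the underlying close error to the caller.
func (db *DB) Close() error {
	return db.conn.Close()
}

// failingConn simulates a connection whose Close fails.
type failingConn struct{}

func (failingConn) Close() error { return errors.New("close failed") }

func main() {
	db := &DB{conn: failingConn{}}
	fmt.Println(db.Close() != nil) // true: the error is visible to the caller
}
```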
|
@rose2221 This PR seems to have a lot of unrelated changes. Is it because of a rebase? If so, please fix it.
```go
infuraRPC := c.String(optionInfuraRPC.Name)
beaconBase := c.String(optionBeaconBase.Name)
// Initialize random seed
rand.Seed(time.Now().UnixNano())
```
```go
	"validator_delay", c.Duration("validator-delay"))

// Setup graceful shutdown
ctx, cancel := signal.NotifyContext(context.Background(), syscall.SIGINT, syscall.SIGTERM)
```
I think we spoke about this: it's already done in main.go, so this is not needed.
```go
latestBlock, err := ethereum.GetLatestBlockNumber(httpc.HTTPClient, infuraRPC)
if err != nil {
	initLogger.Error("failed to get latest block from RPC", "error", err)
```
What if there is an error here? lastBN will be initialized to -10.
```go
	"lookback", c.Int("backfill-lookback"),
	"batch", c.Int("backfill-batch"))
go backfill.RunAll(ctx, db, httpc, createOptionsFromCLI(c), relays)
initLogger.Info("[BACKFILL] completed startup backfill")
```
As we discussed, this log message is wrong; reaching this line does not mean the backfill completed.
```go
// Log block details
initLogger.Info("[BLOCK] processing block", "block", nextBN, "slot", ei.Slot)

if ei.Timestamp != nil {
```
All these log messages are spammy. We should have just one log message that prints all the fields.
```go
	TsMS *int64
}

func MustConnect(ctx context.Context, dsn string, maxConns, minConns int) (*DB, error) {
```
The MustXxx naming convention is usually used for functions that panic on error. If you are returning an error, this can just be called Connect or NewDB.
```go
	initLogger.Warn("[BLOCK] not available yet", "block", nextBN, "error", err)
	continue
}
fields := []any{
```
Not sure why all this is required. The ideal fix for the earlier comment is a single log message:

```go
log.Info("processing block", "key1", value1, "key2", value2)
```
```go
}

// final flush
if len(batch) > 0 {
```
If the main context is cancelled, this insert will be called with a cancelled context.
```go
return db, nil
}

func closeDatabase(db *database.DB, logger *slog.Logger) {
```
This feels like the code was blindly broken up into functions for the sake of the refactor. This function, for example, is not needed.
```go
}
}

func handleShutdown(ctx context.Context, db *database.DB, lastBN int64, logger *slog.Logger) error {
```
This function and saveBlockProgress are doing the exact same thing.
```go
initLogger.Info("indexer configuration", "lookback", c.Int("backfill-lookback"), "batch", c.Int("backfill-batch"))

// Run backfill if configured
runBackfillIfConfigured(ctx, c, db, httpc, relays, initLogger)
```
This happens in a single goroutine, so the main loop will not start until the backfill is finished. This is not what we were planning.
```go
func initializeDatabase(ctx context.Context, dbURL string, logger *slog.Logger) (*database.DB, error) {
	db, err := database.Connect(ctx, dbURL, 20, 5)
	if err != nil {
		logger.Error("[DB] connection failed", "error", err)
```
Not really sure why the log messages need a [SOME KEY] prefix; the message can just say "DB connection failed". What do the square brackets achieve? In both cases you will have to search for "DB".
```go
logger.Info("[RELAY] loaded active relays", "count", len(relays))
for _, r := range relays {
	logger.Info("[RELAY] relay found", "id", r.ID, "url", r.URL)
```
Printing 'Relay' twice in the log message makes no sense.
```go
logger.Info("[DB] block saved successfully", "block", nextBN)

processBidsForBlock(ctx, db, httpc, relays, ei, logger)
launchAsyncValidatorTasks(ctx, c, db, httpc, ei, beaconBase, logger)
```
We launch async tasks and save block progress? What if there was an error in the async task?
```go
func launchAsyncValidatorTasks(ctx context.Context, c *cli.Context, db *database.DB, httpc *retryablehttp.Client, ei *beacon.ExecInfo, beaconBase string, logger *slog.Logger) { // Async validator pubkey fetch
	if ei.ProposerIdx != nil {
		go func(slot int64, proposerIdx int64) {
```
So the first goroutine is supposed to populate the validator public key that the second goroutine uses? What is the point of separate goroutines, then? The second one will fail until the first completes. I'm not sure I understand this logic.
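When one step needs the other's result, a single goroutine running them in order is a simpler sketch (the `fetchPubkey` and `updateOptInStatus` stubs and their values are illustrative placeholders for the real calls):

```go
package main

import "fmt"

// fetchPubkey stands in for the beacon lookup of the proposer's public key.
func fetchPubkey(slot, proposerIdx int64) (string, error) {
	return fmt.Sprintf("pubkey-%d-%d", slot, proposerIdx), nil
}

// updateOptInStatus stands in for the opt-in check that needs the pubkey.
func updateOptInStatus(pubkey string) error {
	return nil
}

func main() {
	done := make(chan error, 1)
	go func(slot, proposerIdx int64) {
		pubkey, err := fetchPubkey(slot, proposerIdx) // step 1
		if err != nil {
			done <- err
			return
		}
		done <- updateOptInStatus(pubkey) // step 2 uses step 1's result
	}(100, 7)
	fmt.Println(<-done == nil) // true
}
```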
```go
}

func saveBlockProgress(db *database.DB, blockNum int64, logger *slog.Logger) {
```
So out of the two functions you removed one, but this function is still here. As I mentioned before, changing code just for the sake of it is not the right approach. First understand whether you even need the function; creating smaller functions for everything hampers readability. There is a balance.
```go
var slotsForBids []int64
var validatorsToCheck []SlotData

for data := range slotChan {
```
A better way is to make slotChan a channel of some result type:

```go
type result struct {
	data SlotData
	err  error
}
```

This way you need only one for loop and can check:

```go
if data.err != nil {
	// handle the error
}
```
```go
logger.Error("opt-in status update failed", "slot", v.Slot, "error", uerr)
}
}
if err := recentBids(ctx, db, httpc, relays, slotsForBids); err != nil {
```
Again, I am not sure you understood what I meant when I asked you to use channels. The idea was to process slots one by one: get a missing slot, fetch the validator info and opted-in status for that slot, then fetch the bids for that slot, and only then move on to the next slot.
You have changed the validator check to be sequential, but the bids still use batches. That does not make sense: either process one by one, or batch everything as before.
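A sketch of the fully sequential per-slot flow described above, with placeholder step functions and a trace slice recording the order purely for illustration:

```go
package main

import "fmt"

var trace []string

// The three steps are stubs for the real per-slot operations.
func fetchValidatorInfo(slot int64) error { trace = append(trace, fmt.Sprintf("validator:%d", slot)); return nil }
func updateOptIn(slot int64) error        { trace = append(trace, fmt.Sprintf("optin:%d", slot)); return nil }
func fetchBids(slot int64) error          { trace = append(trace, fmt.Sprintf("bids:%d", slot)); return nil }

// processSlot handles one missing slot completely before the caller moves on:
// validator info, opt-in status, then the bids for that same slot.
func processSlot(slot int64) error {
	if err := fetchValidatorInfo(slot); err != nil {
		return err
	}
	if err := updateOptIn(slot); err != nil {
		return err
	}
	return fetchBids(slot) // bids for this one slot, not a batch
}

func main() {
	for _, slot := range []int64{101, 102} { // missing slots, one by one
		if err := processSlot(slot); err != nil {
			continue // skip the failing slot, move to the next
		}
	}
	fmt.Println(trace[2]) // bids:101
}
```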
```go
}
}

for _, r := range relays {
```
This for loop should be inside the above for loop no?
```go
}

func launchAsyncValidatorTasks(ctx context.Context, c *cli.Context, db *database.DB, httpc *retryablehttp.Client, ei *beacon.ExecInfo, beaconBase string, logger *slog.Logger) error { // Async validator pubkey fetch
```
This is no longer async. Naming should be changed.
```go
}
return v.Elem().Interface()
}

func processNextBlock(ctx context.Context, c *cli.Context, db *database.DB, httpc *retryablehttp.Client, relays []relay.Row, infuraRPC, beaconBase string, lastBN int64, logger *slog.Logger) int64 {
```
So these three functions are duplicates: processNextBlock, processBidsForBlock, and launchValidatorTask. The backfill does these exact three things; the only difference is that the forward operation obtains the block number by querying the current chain height, while the backfill obtains it by querying the DB for missing slots.
So ideally you could have one file with these three functions:

```go
ProcessBlockData(blockNumber)
ProcessRelayBidData(blockNumber)
ProcessValidatorOptInData(blockNumber, validators)
```

Both the backfill and forward operations can just use these functions; only the logic to get the block number differs.
feat(indexer): builder-observer data ingestor
Adds comprehensive blockchain indexing with MEV relay bid data and BeaconChain metadata for builder observer functionality.
Local Setup (StarRocks in Docker)
Schema DDL (Updated for Builder Observer)
Optional:
Export and Run