Description
The Oracle database type (dbtype) metric may report incorrect values due to a timing issue in the connection and ping logic. The dbtype is currently retrieved in the connect() function, which is called during initialization and after ping failures. However, if Oracle has not fully recovered when connect() is called after a ping failure, the function may return an incorrect dbtype value (e.g., the default value 0), which then persists even after the database connection becomes healthy.
Observed case: the exporter reports oracledb_dbtype as 0, while sqlplus shows 1.
Environment
- File: collector/database.go
- Functions affected: connect(), ping(), NewDatabase()
Expected Behavior
The dbtype should be retrieved and updated when the database connection is confirmed to be healthy (i.e., after a successful ping), not during the connection establishment phase, when the database might still be recovering.
Code Analysis
Current Implementation
	func (d *Database) ping(logger *slog.Logger) error {
		ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
		defer cancel()
		err := d.Session.PingContext(ctx)
		if err != nil {
			d.Up = 0
			if strings.Contains(err.Error(), "sql: database is closed") {
				db, dbtype := connect(logger, d.Name, d.Config)
				d.Session = db
				d.Type = dbtype // dbtype set here, potentially with wrong value
			}
		} else {
			d.Up = 1 // Connection is healthy, but dbtype is not updated
		}
		return err
	}

	func connect(logger *slog.Logger, dbname string, dbconfig DatabaseConfig) (*sql.DB, float64) {
		// ... connection setup code ...
		var result int
		if err := db.QueryRowContext(ctx, "select sys_context('USERENV', 'CON_ID') from dual").Scan(&result); err != nil {
			logger.Info("dbtype err", "error", err, "database", dbname)
		}
		return db, float64(result) // May return 0 if query fails
	}
Suggested Solution
Move the dbtype retrieval logic out of the connect() function so that it runs after a successful ping. This ensures that the database type is only queried and updated when the database is confirmed to be healthy and responsive.
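To make the proposal concrete, here is a minimal sketch of what the reworked ping() could look like. It is not the exporter's actual code: the session interface, the QueryDBType method, and fakeSession are hypothetical names introduced only so the example runs without an Oracle driver, and the Database struct is reduced to the fields relevant here. Reconnect handling is left out on purpose.

```go
package main

import (
	"context"
	"errors"
	"time"
)

// session is a hypothetical abstraction over *sql.DB, introduced only so this
// sketch runs without an Oracle driver. It is not part of the exporter.
type session interface {
	PingContext(ctx context.Context) error
	// QueryDBType stands in for the CON_ID query currently run in connect().
	QueryDBType(ctx context.Context) (float64, error)
}

// Database mirrors only the fields relevant to this issue.
type Database struct {
	Session session
	Up      float64
	Type    float64
}

// ping refreshes Type only after the ping itself has succeeded.
func (d *Database) ping() error {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	if err := d.Session.PingContext(ctx); err != nil {
		d.Up = 0
		// Reconnect handling is omitted here; the point is that Type is
		// left untouched while the database is unhealthy.
		return err
	}
	d.Up = 1

	// If the dbtype query fails, keep the previous value and retry on the
	// next successful ping.
	if dbtype, err := d.Session.QueryDBType(ctx); err == nil {
		d.Type = dbtype
	}
	return nil
}

// fakeSession simulates an instance that is still recovering (assumption:
// both the ping and the dbtype query fail while recovering is true).
type fakeSession struct {
	recovering bool
	dbtype     float64
}

func (f *fakeSession) PingContext(ctx context.Context) error {
	if f.recovering {
		return errors.New("database not available")
	}
	return nil
}

func (f *fakeSession) QueryDBType(ctx context.Context) (float64, error) {
	if f.recovering {
		return 0, errors.New("database not available")
	}
	return f.dbtype, nil
}
```

With this shape, a ping against a recovering instance leaves Type unchanged, and the first successful ping afterwards picks up the real value. One trade-off is that the dbtype query runs on every successful ping; if that is considered too chatty, it could be limited to the first successful ping after a failure or reconnect.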
Note
This is my personal analysis and opinion based on reviewing the codebase. If you think this assessment is correct and the proposed solution is reasonable, I would be happy to submit a pull request to implement the necessary changes. I appreciate your time in reviewing this issue and welcome any feedback or alternative approaches you might suggest.