[Bug] Oracle Database Type (dbtype) Metric Reporting Incorrect Values #278

Open
@kizuna-lek

Description

The Oracle database type (dbtype) metric can report an incorrect value because of a timing issue in the connection and ping logic. The dbtype is retrieved inside connect(), which is called during initialization and again from ping() when a ping fails with "sql: database is closed". If Oracle has not fully recovered when connect() runs after such a failure, the dbtype query can fail and connect() returns the zero value 0. That value then persists even after the database connection becomes healthy, because a successful ping does not refresh the dbtype.

Observed case:

  • Exporter: oracledb_dbtype is reported as 0 (screenshot in the original issue).
  • sqlplus: the value is reported as 1 (screenshot in the original issue).

Environment

  • File: collector/database.go
  • Functions affected: connect(), ping(), NewDatabase()

Expected Behavior

The dbtype should be retrieved and updated when the database connection is confirmed to be healthy (i.e., after a successful ping), not during the connection establishment phase when the database might still be recovering.

Code Analysis

Current Implementation

func (d *Database) ping(logger *slog.Logger) error {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	err := d.Session.PingContext(ctx)
	if err != nil {
		d.Up = 0
		if strings.Contains(err.Error(), "sql: database is closed") {
			db, dbtype := connect(logger, d.Name, d.Config)
			d.Session = db
			d.Type = dbtype  // dbtype set here, potentially with wrong value
		}
	} else {
		d.Up = 1  // Connection is healthy, but dbtype is not updated
	}
	return err
}

func connect(logger *slog.Logger, dbname string, dbconfig DatabaseConfig) (*sql.DB, float64) {
	// ... connection setup code ...
	
	var result int
	if err := db.QueryRowContext(ctx, "select sys_context('USERENV', 'CON_ID') from dual").Scan(&result); err != nil {
		logger.Info("dbtype err", "error", err, "database", dbname)
	}
	
	return db, float64(result)  // May return 0 if query fails
}

Suggested Solution

Move the dbtype retrieval out of connect() and run it after each successful ping. This ensures that the database type is only queried and updated once the database is confirmed to be healthy and responsive.

Note

This is my personal analysis and opinion based on reviewing the codebase. If you think this assessment is correct and the proposed solution is reasonable, I would be happy to submit a pull request to implement the necessary changes. I appreciate your time in reviewing this issue and welcome any feedback or alternative approaches you might suggest.
