Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: improve index constraints for boolean columns #19106

Merged

Conversation

nvanbenschoten
Copy link
Member

@nvanbenschoten nvanbenschoten commented Oct 7, 2017

This PR includes 4 small changes that together greatly improve the
index constraints, and by extension, the index spans that we generate
when dealing with BOOL columns:

  1. We now create index constraints for bool IndexedVars in WHERE
    clauses.
  2. We now normalize NOT NOT b to b, which allows us to improve
    index selection when that pattern is found in WHERE clauses.
  3. We now normalize IS and IS NOT expressions to EQ and NE
    when possible. This allows index selection rules that work only for EQ
    and NE to construct index constraints. This also simplifies things for sql: NaN = NaN (in Postgres) #18860.
  4. We now simplify v != max to v < max and v != min to v > min, which
    allows us to create index constraints on these expressions.

None of the following queries were able to efficiently use indexes before.
Now they all can:

SELECT * FROM t WHERE b -- commit 1
SELECT * FROM t WHERE NOT NOT b -- commit 2
SELECT * FROM t WHERE b IS TRUE -- commit 3
SELECT * FROM t WHERE b IS NOT TRUE -- commit 3/4
SELECT * FROM t WHERE b IS FALSE -- commit 3
SELECT * FROM t WHERE b IS NOT FALSE -- commit 3/4
SELECT * FROM t WHERE b != true -- commit 4
SELECT * FROM t WHERE b != false -- commit 4

@nvanbenschoten nvanbenschoten requested review from knz, RaduBerinde and a team October 7, 2017 05:09
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@RaduBerinde
Copy link
Member

:lgtm: Nice improvements!


Review status: 0 of 6 files reviewed at latest revision, 4 unresolved discussions.


pkg/sql/analyze.go, line 133 at r1 (raw file):

//   (a < 1 AND a > 2)  -> false
//   (a > 1 OR a < 2)   -> true
//   (a > 1 OR func(b)) -> true

[nit] perhaps add the new example here (a) -> (a = true)


pkg/sql/analyze.go, line 1483 at r1 (raw file):

}

func simplifyBoolVar(evalCtx *parser.EvalContext, n *parser.IndexedVar) (parser.TypedExpr, bool) {

needs a comment. It's a little unintuitive what it does given its name, arguably "a" is simpler than "a = true"


pkg/sql/parser/normalize.go, line 450 at r3 (raw file):

			return NewTypedAndExpr(
				NewTypedComparisonExpr(EQ, expr.TypedLeft(), expr.TypedRight()),
				NewTypedComparisonExpr(IsNot, expr.TypedLeft(), DNull),

Is this needed? Would a NULL ever satisfy an equality?


pkg/sql/parser/normalize.go, line 457 at r3 (raw file):

			return NewTypedOrExpr(
				NewTypedComparisonExpr(NE, expr.TypedLeft(), expr.TypedRight()),
				NewTypedComparisonExpr(Is, expr.TypedLeft(), DNull),

Is this needed? isn't NE always true if one of the operands is NULL?


Comments from Reviewable

@knz
Copy link
Contributor

knz commented Oct 7, 2017

:lgtm: no more comments than radu


Reviewed 4 of 4 files at r1, 4 of 4 files at r2, 4 of 4 files at r3, 3 of 3 files at r4.
Review status: all files reviewed at latest revision, 4 unresolved discussions.


Comments from Reviewable

Index selection only creates index constraints when it sees
`ComparisonExprs` that it knows how to handle. This means that if it saw
a query like `SELECT * FROM t WHERE b`, where `b` is a `BOOL` columns,
it would previously end up scanning the entire table even if there was
an index on `b`! However, `SELECT * from t WHERE b = true` would work
correctly.

This change fixes this by simplifying boolean `IndexedVars` into
`ComparisonExprs`.
This change adds normalization and simplification rules for `NotExprs`:
```
NotExpr{NotExpr{e}} -> e
```

We technically only need to add this as a normalization rule, but
until cockroachdb#2791 is addressed, I think it's better to add both to preserve
symmetry with `ComparisonExpr`, `AndExpr`, and `OrExpr`.

These rules allow us to select better index spans in certain situations.
This change adds new normalization rules:
```
ComparisonExpr{Is,v,d} where d != DNull ->
  AndExpr{
    ComparisonExpr{EQ,v,d}
    ComparisonExpr{IsNot,v,DNull}
  }
ComparisonExpr{IsNot,v,d} where d != DNull ->
  OrExpr{
    ComparisonExpr{NE,v,d}
    ComparisonExpr{Is,v,DNull}
  }
```

This allows us to take advantage of obvious index constraints
for queries like `SELECT * FROM t WHERE b IS TRUE`. It will
also avoid complications being discussed in cockroachdb#18860 by addressing
the root issue.

There may be an argument to make this a simplification rule instead.
This change adds two new simplification rules:
```
ComparisonExpr{NE,v,d} where d.Max() -> ComparisonExpr{LT,v,d}
ComparisonExpr{NE,v,d} where d.Min() -> ComparisonExpr{GT,v,d}
```

These inequalities can be more easily used for index selection. For
datum types with large domains, this isn't particularly useful. However,
for types like BOOL, this is important because it means we can use an
index constraint for queries like `SELECT * FROM t WHERE b != true`.
@nvanbenschoten
Copy link
Member Author

Review status: 1 of 6 files reviewed at latest revision, 4 unresolved discussions.


pkg/sql/analyze.go, line 133 at r1 (raw file):

Previously, RaduBerinde wrote…

[nit] perhaps add the new example here (a) -> (a = true)

Done.


pkg/sql/analyze.go, line 1483 at r1 (raw file):

Previously, RaduBerinde wrote…

needs a comment. It's a little unintuitive what it does given its name, arguably "a" is simpler than "a = true"

Done.


pkg/sql/parser/normalize.go, line 450 at r3 (raw file):

Previously, RaduBerinde wrote…

Is this needed? Would a NULL ever satisfy an equality?

Added a comment. The gist is that EQ exprs don't handle NULL values the same way that IS exprs do.


pkg/sql/parser/normalize.go, line 457 at r3 (raw file):

Previously, RaduBerinde wrote…

Is this needed? isn't NE always true if one of the operands is NULL?

Same as above. Also, got rid of isVar(expr.Left) && , which wasn't meant to be there.


Comments from Reviewable

This change adds simplification rules to `simplifyNotExpr`:
```
NotExpr{ComparisonExpr{NotRegMatch,a,b}}       -> ComparisonExpr{RegMatch,a,b}
NotExpr{ComparisonExpr{NotRegIMatch,a,b}}      -> ComparisonExpr{RegIMatch,a,b}
NotExpr{ComparisonExpr{IsDistinctFrom,a,b}}    -> ComparisonExpr{IsNotDistinctFrom,a,b}
NotExpr{ComparisonExpr{IsNotDistinctFrom,a,b}} -> ComparisonExpr{IsDistinctFrom,a,b}
NotExpr{ComparisonExpr{Is,a,b}}                -> ComparisonExpr{IsNot,a,b}
NotExpr{ComparisonExpr{IsNota,b}}              -> ComparisonExpr{Is,a,b}
```
@nvanbenschoten nvanbenschoten merged commit bf15aab into cockroachdb:master Oct 8, 2017
@nvanbenschoten nvanbenschoten deleted the nvanbenschoten/notSpans branch October 8, 2017 01:13
@RaduBerinde
Copy link
Member

:lgtm:


Comments from Reviewable

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants