Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update Kuery Syntax #15857

Merged
merged 39 commits into from
Mar 7, 2018
Merged

Update Kuery Syntax #15857

merged 39 commits into from
Mar 7, 2018

Conversation

lukasolson
Copy link
Member

@lukasolson lukasolson commented Jan 4, 2018

Closes #15727.
Relies on #15646.

This PR evolves Kuery into a simplified, more lucene-like language. In a follow up PR we plan on getting rid of the language switcher dropdown and instead add a checkbox that will allow users to opt-in to this syntax, which will feel less like a separate language and more like a set of enhancements to the existing lucene query syntax.

This PR also updates the simple grammar we extracted in #15646 to solve the following:

  • Wildcards in field names
  • "is one of" shorthand (e.g. foo:(bar OR baz OR qux))
  • Don't split on whitespace (require explicit AND/ORs, case insensitive)
  • "exists" query (e.g. foo:*)
  • Improved range query shorthands (e.g. foo <= 1)
  • Wildcard queries

Since this new syntax is simpler and does not support every filter type like old Kuery did, this PR also brings back the filter bar when kuery is selected in the language picker.

See the documentation updates in this PR if you need an overview of the new syntax (and let us know if the documentation is lacking).

@lukasolson lukasolson self-assigned this Jan 4, 2018
@lukasolson lukasolson requested a review from Bargs January 4, 2018 22:52
@lukasolson lukasolson changed the title Kql Update simple Kibana query language syntax Jan 4, 2018
Copy link
Contributor

@Bargs Bargs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm really liking the direction the language is going, the syntax felt really natural and powerful.

A couple comments, in addition to the inline comments I made:

Do we want to distinguish between basic multi-word analysis and phrase search? In other words, should we only do a match_phrase query if quotes are used, and a regular match query otherwise? This is something we talked about doing on Zoom.

Could you write up some tests similar to the ones we have for Kuery here https://github.com/elastic/kibana/blob/d379e9a35b63a73e2b9ed30d68e23fdf80ba3af2/src/ui/public/kuery/ast/__tests__/ast.js ? I'd especially like to see some tests around escaping, I found that part of the grammar particularly confusing and I'm still not 100% sure I've got my head around it, I need to give it some more thought. It seems like it shouldn't need to be quite so complex.

}
OrQuery
= head:AndQuery Or tail:OrQuery {
return buildNode('or', [head, tail]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

head/tail seems like sort of weird terminology to me here. I usually think of a head being the first item in an array, and the tail being the remaining (0 or more) items, but in this case it's always a pair, which is why I used left/right before.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

space
= [\ \t\r\n]+
EscapedQuote
= '\\' quote:('"' / '"') { return quote; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there might be a typo here, both quotes are the same. Shouldn't one of them be a single quote?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2d86135


FieldListExpression
= field:Literal Space* ':' Space* list:SubList {
function buildNodeWithField(field, list) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it necessary to build up this intermediate data structure and then loop through it like this? Couldn't we build the actual nodes in each of the List rules?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have access to the field itself in each of the List rules. I don't like this stuff either, though... It has some code smell. I'll play around with it and see what I can think of. Feel free to chime in if you have any ideas.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

@Bargs
Copy link
Contributor

Bargs commented Feb 6, 2018

Do we want to distinguish between basic multi-word analysis and phrase search? In other words, should we only do a match_phrase query if quotes are used, and a regular match query otherwise? This is something we talked about doing on Zoom.

As I've started working on adding wildcard query support, I think we do want to do this. Say we had a full_text field with the value Quick Foxes!

full_text:Qui* Foxes should match
full_text:"Qui* Foxes" should not

The former would get compiled into a bool query with a query_string query for the wildcard term and a match query for Foxes. The latter would generate a single match_phrase query for the entire Qui* Foxes value.

Copy link
Contributor

@Bargs Bargs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noticed one small typo. Otherwise, pending some tests, this looks good to me.

I do think we should add the distinction between phrase queries and multi word queries for quotes and no quotes, but that can be done in an additional PR. I'd really like to get this one merged as soon as we can so the wildcard work can be built on top of it.

return addMeta(nodeTypes.function.buildNode('not', clause), text(), location());
NotList
= Not list:SubList {
return buildNode('not', query);;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think query is supposed to be list here?

function buildNodeWithField(node) {
if (node.hasOwnProperty('arguments')) {
const args = node.arguments.map(buildNodeWithField);
return { ...list, arguments: args };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this a bit more than the old version, though it's still kinda hard to understand. I can't say I have a better idea at the moment though.

I wonder if at some point it will make sense for the <field>:(sub-expression) to have its own node type in the AST, so that the field could be associated with that node rather than the individual is nodes. That would make it easier to know how to re-serialize the AST too. Just thinking out loud, it would probably be a waste of time to mess around with that now since this serves our needs fine at the moment.

return query;
ValueExpression
= value:Literal {
return buildNode('is', '*', value);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just noticed this when I was working on wildcards. We should pass null instead of * as the second argument here. * will force an all fields query, null will fall back on the default field set in the index settings, which Beats requires since the number of fields in their index patterns max out the new limit on # of fields in a single query in ES.

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

@elasticmachine
Copy link
Contributor

💔 Build Failed

@Bargs
Copy link
Contributor

Bargs commented Feb 21, 2018

I'm still planning on adding some tests for new functionality tomorrow, but at least all of the existing tests are passing now.

@elasticmachine
Copy link
Contributor

💔 Build Failed

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

@Bargs
Copy link
Contributor

Bargs commented Feb 23, 2018

All done updating tests. This is g2g until we get further review feedback.

@elasticmachine
Copy link
Contributor

💔 Build Failed

Copy link
Contributor

@stacey-gammon stacey-gammon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the new language a lot and the code also seems to flow better. My only confusion was with the KQL and Kuery naming, but just take the comments as feedback.

I did skim a bit through some of the ast code, but everything else looks good to me.

Is the effort to combine the filter bar and the query bar being abandoned? I don't think that's a bad thing necessarily - seems like trying to make the language do everything also made it too complex. Just wondering if the long term plan has changed.

Kuery is a new query language built specifically for Kibana. It aims to simplify the search experience in Kibana
and enable the creation of helpful features like auto-complete, seamless migration of saved searches, additional
query types, and more. Kuery is a basic experience today but we're hard at work building these additional features on
top of the foundation Kuery provides.

Kueries are built with functions. Many functions take a field name as their first argument. Extremely common functions have shorthand notations.
If you're familiar with Kibana's old lucene query syntax, you should feel right at home with Kuery. Both languages
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we calling it Kuery or KQL? The PR description makes it sound like we are renaming it KQL, but here it's still referred to as Kuery.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah sorry, it's a bit confusing. We've been calling the simple language "KQL" in the code and amongst ourselves to distinguish it from the old complex grammar. It'll still be called "kuery" in the UI when this PR is merged, but the plan is ultimately to remove the language dropdown and just add a checkbox to opt-in to "query language enhancements", which will turn on the simple language hopefully without it seeming like a big jump to a whole different language. At that point we can get rid of the kuery name, it just didn't make sense to us to introduce yet another new language to the language switcher only to take it away soon after.

@@ -53,7 +53,7 @@ describe('build query', function () {
{ match_all: {} },
],
filter: [
toElasticsearchQuery(fromKueryExpression('foo:bar'), indexPattern),
toElasticsearchQuery(fromKqlExpression('extension:jpg'), indexPattern),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there three languages - Kuery, Kql and Lucene? Or are KQL and Kuery the same thing? The PR description makes it seem like we are going to use KQL instead of Kuery but here the language on line 39 is kept as 'kuery' while the function name changed to 'fromKqlExpression'.

filter: [].concat(kueryQuery.filter, kqlQuery.filter, luceneQuery.filter, filterQuery.filter),
should: [].concat(kueryQuery.should, kqlQuery.should, luceneQuery.should, filterQuery.should),
must_not: [].concat(kueryQuery.must_not, kqlQuery.must_not, luceneQuery.must_not, filterQuery.must_not),
must: [].concat(kueryQuery.must, luceneQuery.must, filterQuery.must),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm following correctly, you got rid of the KQL version here because you pushed the logic for determining whether something is KQL or Kuery inside buildQueryFromKuery? Is that right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep!

export function buildQueryFromKuery(indexPattern, queries = []) {
const queryASTs = queries.map((query) => {
try {
return fromKqlExpression(query.query);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit hard to follow the Kql vs Kuery. Here buildQueryFromKuery really means build query from either kuery or KQL. But sometimes it's specific to only the language noted - fromKqlExpression will throw an error if the query is kuery, not Kql. And fromKqlExpression is inside a folder named kuery.

Just giving my perspective as a new reader to this code. Might be easier if the kql vs kuery differences could all be hidden inside the kuery folder and not exported. Then only one name would need to be used externally.

@Bargs Bargs changed the title Update KQL syntax Update Kuery Syntax Mar 4, 2018
@Bargs
Copy link
Contributor

Bargs commented Mar 4, 2018

@stacey-gammon ok I did some renaming to hopefully make things more clear. I got rid of all references to KQL since we're not actually exposing that to the user. Now I think the code better represents what we're thinking, old kuery became legacyKuery and KQL became kuery. Let us know what you think!

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

@stacey-gammon
Copy link
Contributor

awesome, I find this version a lot easier to read, thanks @Bargs!

@Bargs Bargs merged commit e9ff31f into elastic:master Mar 7, 2018
Bargs pushed a commit to Bargs/kibana that referenced this pull request Mar 7, 2018
This PR evolves Kuery into a simplified, more lucene-like language. In a follow up PR we plan on getting rid of the language switcher dropdown and instead add a checkbox that will allow users to opt-in to this syntax, which will feel less like a separate language and more like a set of enhancements to the existing lucene query syntax.

This PR also updates the simple grammar to solve the following:

* Wildcards in field names
* "is one of" shorthand (e.g. foo:(bar OR baz OR qux))
* Don't split on whitespace (require explicit AND/ORs, case insensitive)
* "exists" query (e.g. foo:*)
* Improved range query shorthands (e.g. foo <= 1)
* Wildcard queries

Since this new syntax is simpler and does not support every filter type like old Kuery did, this PR also brings back the filter bar when kuery is selected in the language picker.

See the documentation updates in this PR if you need an overview of the new syntax (and let us know if the documentation is lacking).
Bargs added a commit that referenced this pull request Mar 7, 2018
This PR evolves Kuery into a simplified, more lucene-like language. In a follow up PR we plan on getting rid of the language switcher dropdown and instead add a checkbox that will allow users to opt-in to this syntax, which will feel less like a separate language and more like a set of enhancements to the existing lucene query syntax.

This PR also updates the simple grammar to solve the following:

* Wildcards in field names
* "is one of" shorthand (e.g. foo:(bar OR baz OR qux))
* Don't split on whitespace (require explicit AND/ORs, case insensitive)
* "exists" query (e.g. foo:*)
* Improved range query shorthands (e.g. foo <= 1)
* Wildcard queries

Since this new syntax is simpler and does not support every filter type like old Kuery did, this PR also brings back the filter bar when kuery is selected in the language picker.

See the documentation updates in this PR if you need an overview of the new syntax (and let us know if the documentation is lacking).
@lukasolson lukasolson deleted the kql branch March 27, 2018 21:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants