Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decouple schema fetch queries timeouts from server side timeouts #361

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

dkropachev
Copy link
Collaborator

@dkropachev dkropachev commented Aug 8, 2024

Scylla have an ability to override server timeout by appending USING TIMEOUT <timeout>ms to the query
Make driver add USING TIMEOUT <control_connection_timeout*1000>ms for scylla connections

Closes #320

@dkropachev dkropachev force-pushed the dk/320-python-implementation-of-decouple-schema-fetch-queries-from-server-side-timeouts branch 5 times, most recently from d94edcc to 31d490a Compare August 9, 2024 00:06
@dkropachev dkropachev marked this pull request as ready for review August 9, 2024 01:03
@dkropachev dkropachev self-assigned this Aug 9, 2024
@dkropachev dkropachev force-pushed the dk/320-python-implementation-of-decouple-schema-fetch-queries-from-server-side-timeouts branch from 31d490a to ea6ed21 Compare August 11, 2024 07:29
@dkropachev dkropachev requested a review from fruch August 16, 2024 02:15
Copy link

@fruch fruch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test classes need to have Test in thier name

@dkropachev dkropachev force-pushed the dk/320-python-implementation-of-decouple-schema-fetch-queries-from-server-side-timeouts branch from ea6ed21 to d1f38d8 Compare August 16, 2024 12:33
@dkropachev dkropachev requested a review from fruch August 16, 2024 12:34
Comment on lines 1158 to 1160
shard_aware_options=None,
metadata_request_timeout=datetime.timedelta(seconds=2),
):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it will be less confusing if the default is in one place (definition of the field) and here it is just None)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 99 to 108
class SchemaQueryMessage(QueryMessage):
def __init__(self, query, consistency_level, serial_consistency_level=None,
fetch_size=None, paging_state=None, timestamp=None, continuous_paging_options=None, keyspace=None,
custom_timeout=None):
super(SchemaQueryMessage, self).__init__(
add_timeout_to_query(query, custom_timeout), consistency_level,
serial_consistency_level=serial_consistency_level, fetch_size=fetch_size, paging_state=paging_state,
timestamp=timestamp, continuous_paging_options=continuous_paging_options, keyspace=keyspace)


Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Why super(SchemaQueryMessage, self).__init__ instead of super().__init__? I tried to research it quickly, and internet says that in Python 3 those are equal.
  2. Parameter name should be something like server_side_timeout - right now it is not clear which timeout is it for someone reading the code.
  3. A comment explaining what this class is would be appreciated. It's not obvious to me why you are introducing new class instead of adding parameter to QueryMessage contructor.
  4. Also consider introducing it in a separate commit. I'd like to see this PR split into commits because it is not easy to digest as one commit. Possible way to split:
  • Introduce SchemaQueryMessage, explaining its purpose, and change usages of QueryMessage to SchemaQueryMessage (without using custom_timeout parameter for now)
  • Add timeout parameter to I think Metadata.refresh - or whatever is the place from which it is taken to be passed to SchemaQueryMessage
  • Add the plumbing in Cluster - this is the moment that the new feature is actually used.
  • Add tests

I'll wait with further review until the PR is split into commits to be more digestable.

Copy link
Collaborator Author

@dkropachev dkropachev Sep 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Lorak-mmk , I have dropped SchemaQueryMessage, and split PR into commits, please take a look

@roydahan
Copy link
Collaborator

@dkropachev what's the status of this PR?

@dkropachev
Copy link
Collaborator Author

@dkropachev what's the status of this PR?

I have postponed working on this PR due to the higher priority tasks on my side and @Lorak-mmk being focused on rust driver.
This week I will address comments.

@roydahan
Copy link
Collaborator

@dkropachev let's try to merge this by the end of quarter if possible.

@dkropachev dkropachev force-pushed the dk/320-python-implementation-of-decouple-schema-fetch-queries-from-server-side-timeouts branch 5 times, most recently from d40a956 to cbfc3fb Compare September 29, 2024 22:26
@dkropachev
Copy link
Collaborator Author

aarch64 tests failing on test_multi_timer_validation, this test is known to be flaky on this architecture.

Comment on lines 2059 to 2063
def __init__(self, connection, timeout, fetch_size, metadata_request_timeout=None):
super(SchemaParserV22, self).__init__(connection, timeout, fetch_size, metadata_request_timeout)
self.keyspaces_result = []
self.tables_result = []

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SchemaParser* structs are not part of public interface I think, se we should be able to add new mandatory argument to __init__. Wouldn't it be better for metadata_request_timeout to be such an argument, like timeout is? There is less room for mistakes on the caller side with mandatory arguments, and you are always passing the argument anyway.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 1808 to 1813

def add_timeout_to_query(stmt: str, metadata_request_timeout: datetime.timedelta) -> str:
if metadata_request_timeout is None:
return stmt
ms = int(metadata_request_timeout / datetime.timedelta(milliseconds=1))
if ms == 0 or _query_timeout_regexp.search(stmt) is not None:
return stmt
return f"{stmt} USING TIMEOUT {ms}ms"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if metadata_request_timeout can be None then the type should be Optional[datetime.timedelta].

I really wish we could sit down and add types to the whole driver and then run pyright (or other type checker) in CI.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment on lines 1808 to 1810

def add_timeout_to_query(stmt: str, metadata_request_timeout: datetime.timedelta) -> str:
if metadata_request_timeout is None:
return stmt
ms = int(metadata_request_timeout / datetime.timedelta(milliseconds=1))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better name for this function would be maybe_add_timeout_to_query as it doesn't always add the timeout.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment on lines 1804 to 1807


_query_timeout_regexp = re.compile('USING[ \t]+TIMEOUT[ \t]+[0-9]+[a-z]+[ \t]*;?$')

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We control all the places where the add_timeout_to_query function is called, there should be no need to have such a regex and performs checks with it.

If this is supposed to be a more universal function (and thus requiring this regex) then there are some problems with it (some of them unsolvable):

  • For INSERT there may be no literal USING TIMEOUT. INSERT INTO ks.t2 (pk, ck) VALUES (1, 2) USING TIMESTAMP 12345678 AND TIMEOUT 360ms; is a valid statement. I don't think it's fixable using regular expressions.
  • As soon as some new element of SELECT is added (and USING TIMEOUT doesn't have to be at the end of the statement) similar problem as above happens for SELECTs.
  • Time unit can use uppercase letters, so [a-z] is wrong.
  • USING TIMEOUT is case insensitive.

Copy link
Collaborator Author

@dkropachev dkropachev Oct 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree, at the worst we will see that query fails on the integration and unit tests, regexp removed.

This option allows user to control timeout for driver internal queries.
Idea is to make driver queries more resilient and being independent of
user queries.
@dkropachev dkropachev force-pushed the dk/320-python-implementation-of-decouple-schema-fetch-queries-from-server-side-timeouts branch 2 times, most recently from ed356aa to 5b0a63f Compare October 3, 2024 11:25
@dkropachev dkropachev force-pushed the dk/320-python-implementation-of-decouple-schema-fetch-queries-from-server-side-timeouts branch from 5b0a63f to 7a4ae44 Compare October 3, 2024 13:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Python implementation of "Decouple schema fetch queries from server-side timeouts"
4 participants