Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SqlNotificationInfo.None enum value missing, causing unnecessary exceptions. #1376

Closed
ascott18 opened this issue Oct 29, 2021 · 18 comments
Closed

Comments

@ascott18
Copy link

ascott18 commented Oct 29, 2021

Describe the bug

In SqlDependencyListener, when an incoming XML message is parsed, sometimes these messages look like this:

<qn:QueryNotification xmlns:qn="http://schemas.microsoft.com/SQL/Notifications/QueryNotification" id="4579" type="change" source="timeout" info="none" database_id="9" sid="0x01050000000000051500000002AAFCB5FFBA57356D2C83B7E9030000"><qn:Message>52cdaaaf-f5e5-4342-8c10-dfec05cf4950;ec06ef71-ec94-4e56-a11e-c1905ebf4cf9</qn:Message></qn:QueryNotification>

The info attribute is parsed at https://github.com/dotnet/SqlClient/blob/main/src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/SqlDependencyListener.cs#L1133-L1167. However, the enum that the value is parsed into does not included a value "None", causing an exception to be thrown. It is ultimately handled by the containing try/catch, but there are a few issues with this:

  • Exception flow and handling incurs a performance cost.
  • The exception is System.ArgumentException and is thrown from mscorlib, which means that suppressing it in Visual Studio's Exception Settings causes all such exceptions to be supressed, making debugging more difficult when there are other AgumentExceptions thrown by mscorlib that I do want to break on.

image

To reproduce

This is a bit weird to reproduce. The first few notifications where will be of Info == Error. But sometimes, it will be "none" in the XML, which then throws and gets turned into Unknown.

Set exception settings in Visual Studio to break on all CLR exceptions.

// See https://aka.ms/new-console-template for more information
using Microsoft.Data.SqlClient;

var connectionString = "Server=localhost;Database=BugReproDb;Trusted_Connection=True;Command Timeout=5";
SqlDependency.Start(connectionString);
var connection = new SqlConnection(connectionString);
connection.Open();
var cmd1 = connection.CreateCommand();
cmd1.CommandText = "IF OBJECT_ID('TestTable') IS NULL CREATE TABLE TestTable (c1 int);";
cmd1.ExecuteNonQuery();
cmd1.Dispose();

Console.WriteLine("Usually the notification here will be 'Change Timeout Error'.");
Console.WriteLine("But sometimes, it will be 'Change Timeout Unknown'. This is the error.");
while (true)
{
    var tcs = new TaskCompletionSource();
    var cmd = connection.CreateCommand();
    cmd.CommandText = "SELECT c1 FROM dbo.TestTable";

    var dependency = new SqlDependency(cmd, null, (int)1);
    dependency.OnChange += (o, e) => {
        Console.WriteLine($"{e.Type} {e.Source} {e.Info}");
        if (e.Info == SqlNotificationInfo.Unknown)
        {
            Environment.Exit(0);
        }
        tcs.SetResult();
    };
    cmd.ExecuteNonQuery();
    await tcs.Task;
    cmd.Dispose();
}

Expected behavior

SqlNotificationInfo.None is defined so exception flow isn't needed.

Further technical details

Microsoft.Data.SqlClient version: 3.0.1
.NET target: net6.0
SQL Server version: 15.0.2000.5
Operating system: Win 10

@Wraith2
Copy link
Contributor

Wraith2 commented Oct 30, 2021

Interesting. According to the schema that it claims that it implements the value none is invalid, http://schemas.microsoft.com/SQL/Notifications/QueryNotification/QueryNotification.xsd either the value must be missing or one of the values specified in the restriction which doesn't include none.

We could handle the none case in this library probably by using TryParse and then checking for the literal "none" value but the SQL Server either shouldn't be sending that value of should be using a schema that allows that value and those changes would need to be made at the server side.

@johnnypham
Copy link
Contributor

@Wraith2 where does it state that it implements that schema? There are already some values in the enum not present in the schema.

@Wraith2
Copy link
Contributor

Wraith2 commented Nov 1, 2021

<qn:QueryNotification xmlns:qn="http://schemas.microsoft.com/SQL/Notifications/QueryNotification" id="4579" type="change" source="timeout" info="none" database_id="9" sid="0x01050000000000051500000002AAFCB5FFBA57356D2C83B7E9030000">qn:Message52cdaaaf-f5e5-4342-8c10-dfec05cf4950;ec06ef71-ec94-4e56-a11e-c1905ebf4cf9</qn:Message></qn:QueryNotification>

Right there in the xml the user provided. It uses xmlns to declare a short namespace prefix qn and identifies that it implements the schema at the url. That url exists and has a short and easy to read (a rarity) xsd schema.

@johnnypham
Copy link
Contributor

Oops, guess I'm blind. If none is not a supported value, should we be adding it to the enum? This could be by design and we can just improve it by not throwing that exception but ultimately keeping it as unknown.

@Wraith2
Copy link
Contributor

Wraith2 commented Nov 1, 2021

I think we have to add handling of it somehow but I don't know whether we want to add it to the enum. If we're going to change the enum (which is a breaking change) then it would probably be a good idea to ask the Sql Server team what possible values it can take from their side and make sure everything is covered. It's clearly no longer limited to the originally defined values.

@johnnypham
Copy link
Contributor

Server team has confirmed the server shouldn't be sending that value and this is likely a server-side bug.

@Wraith2
Copy link
Contributor

Wraith2 commented Nov 3, 2021

Excellent news. I think it is still best to handle it in this library without throwing since we can't be certain users will ever upgrade or even be on a supported version. What do you think?

@ascott18
Copy link
Author

ascott18 commented Nov 3, 2021

Yeah, I like the idea of at least changing this to a TryParse rather than a Parse with try/catch.

@johnnypham
Copy link
Contributor

I agree, I'll make a PR for it.

@johnnypham
Copy link
Contributor

@ascott18 I'm unable to reproduce it on 15.0.2080.9. I ran it for several minutes, how long should it take?

@ascott18
Copy link
Author

ascott18 commented Nov 3, 2021

@johnnypham When I originally made the repro it consistently happened on the third notification. I just ran it again and got it on the 4th.

But sometimes when I'm developing the actual codebase where this is used, I can go 30 minutes or more until it happens and VS breaks on that exception.

Edit: Definitely something very strange going on inside of SQL Server here. I added timestamps to the output and got the following:

PS C:\src\repros\sqlnotificationnone> while ($true) { dotnet run }
Usually the notification here will be 'Change Timeout Error'.
But sometimes, it will be 'Change Timeout Unknown'. This is the error.
11/3/2021 5:05:00 PM Change Timeout Error
11/3/2021 5:05:15 PM Change Timeout Error
11/3/2021 5:05:30 PM Change Timeout Error
11/3/2021 5:05:33 PM Change Timeout Unknown
Usually the notification here will be 'Change Timeout Error'.
But sometimes, it will be 'Change Timeout Unknown'. This is the error.
11/3/2021 5:05:50 PM Change Timeout Error
11/3/2021 5:06:05 PM Change Timeout Error
11/3/2021 5:06:20 PM Change Timeout Error
11/3/2021 5:06:33 PM Change Timeout Unknown
Usually the notification here will be 'Change Timeout Error'.
But sometimes, it will be 'Change Timeout Unknown'. This is the error.
11/3/2021 5:06:50 PM Change Timeout Error
11/3/2021 5:07:05 PM Change Timeout Error
11/3/2021 5:07:20 PM Change Timeout Error
11/3/2021 5:07:33 PM Change Timeout Unknown
Usually the notification here will be 'Change Timeout Error'.
But sometimes, it will be 'Change Timeout Unknown'. This is the error.
11/3/2021 5:07:50 PM Change Timeout Error
11/3/2021 5:08:05 PM Change Timeout Error
11/3/2021 5:08:20 PM Change Timeout Error
11/3/2021 5:08:33 PM Change Timeout Unknown

The "none" always happens at 33 seconds after the minute. And the regular timeouts (a) are locked to 5 second increments (I guess due to some internal timer in SQL Server?), but also (b) are timing out after 15 seconds, despite my configured value of 1 second in the code.

@ascott18
Copy link
Author

ascott18 commented Nov 4, 2021

Had a hunch and did a little more checking. As I thought, the timing of the "none" comes from when SQL Server started up. The date printed post startup here is 28 seconds, when is the second stamp of all the "none"s (Unknown).

I wonder if the fact that sometimes I can go a little while without seeing this in debugging is related to the fact that I'm actually doing stuff to the database during debugging, whereas this repro is the singular thing doing anything to this database.

PS> Stop-Service MSSQLSERVER; Get-Date; Start-Service MSSQLSERVER; Get-Date; while ($true) {dotnet run}
WARNING: Waiting for service 'SQL Server (MSSQLSERVER) (MSSQLSERVER)' to stop...
WARNING: Waiting for service 'SQL Server (MSSQLSERVER) (MSSQLSERVER)' to stop...
WARNING: Waiting for service 'SQL Server (MSSQLSERVER) (MSSQLSERVER)' to stop...
WARNING: Waiting for service 'SQL Server (MSSQLSERVER) (MSSQLSERVER)' to stop...

Wednesday, November 3, 2021 5:16:27 PM
Wednesday, November 3, 2021 5:16:28 PM
Repro starting: 11/3/2021 5:16:29 PM
Usually the notification here will be 'Change Timeout Error'.
But sometimes, it will be 'Change Timeout Unknown'. This is the error.
11/3/2021 5:16:44 PM Change Timeout Error
11/3/2021 5:16:59 PM Change Timeout Error
11/3/2021 5:17:14 PM Change Timeout Error
11/3/2021 5:17:28 PM Change Timeout Unknown
Repro starting: 11/3/2021 5:17:29 PM
Usually the notification here will be 'Change Timeout Error'.
But sometimes, it will be 'Change Timeout Unknown'. This is the error.
11/3/2021 5:17:44 PM Change Timeout Error
11/3/2021 5:17:59 PM Change Timeout Error
11/3/2021 5:18:14 PM Change Timeout Error
11/3/2021 5:18:28 PM Change Timeout Unknown

@johnnypham
Copy link
Contributor

Hmm still unable to repro. I'll try to find someone who can. If I can't, would you mind validating the changes?

@ascott18
Copy link
Author

ascott18 commented Nov 5, 2021

Sure!

@johnnypham
Copy link
Contributor

@ascott18
Copy link
Author

ascott18 commented Nov 9, 2021

Thanks, it works (added a review to the PR).

On an unrelated note, I had to add "TrustServerCertificate=True" to my connection string (connecting to localhost). I'm guessing the defaults around cert trusting are being changed in 4.0?

@Wraith2
Copy link
Contributor

Wraith2 commented Nov 9, 2021

Yes they are.

@johnnypham
Copy link
Contributor

Closing as #1378 has been merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants