Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

migrate auth_test to common #14041

Closed
wants to merge 1 commit into from
Closed

Conversation

chaochn47
Copy link
Member

@chaochn47 chaochn47 commented May 13, 2022

Related to #13637


@chaochn47
Copy link
Member Author

@serathius Can I get some early feebacks if the common test framework auth implementation makes sense to you? Thanks!

@codecov-commenter
Copy link

codecov-commenter commented May 13, 2022

Codecov Report

Merging #14041 (d5f2696) into main (9cc2f64) will decrease coverage by 1.15%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main   #14041      +/-   ##
==========================================
- Coverage   75.80%   74.65%   -1.16%     
==========================================
  Files         457      457              
  Lines       37421    37391      -30     
==========================================
- Hits        28367    27914     -453     
- Misses       7294     7747     +453     
+ Partials     1760     1730      -30     
Flag Coverage Δ
all 74.65% <ø> (-1.16%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
etcdctl/ctlv3/command/txn_command.go 6.14% <0.00%> (-57.02%) ⬇️
etcdctl/ctlv3/command/defrag_command.go 26.92% <0.00%> (-50.00%) ⬇️
server/proxy/grpcproxy/auth.go 44.44% <0.00%> (-44.45%) ⬇️
etcdctl/ctlv3/command/printer_simple.go 19.13% <0.00%> (-33.96%) ⬇️
server/etcdserver/api/v3rpc/auth.go 55.43% <0.00%> (-26.09%) ⬇️
etcdctl/ctlv3/command/ep_command.go 26.11% <0.00%> (-25.56%) ⬇️
server/etcdserver/apply/apply_auth.go 71.00% <0.00%> (-17.00%) ⬇️
etcdctl/ctlv3/command/auth_command.go 54.23% <0.00%> (-16.95%) ⬇️
etcdctl/ctlv3/command/role_command.go 61.29% <0.00%> (-9.68%) ⬇️
server/etcdserver/apply/apply.go 73.39% <0.00%> (-9.18%) ⬇️
... and 30 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@ahrtr
Copy link
Member

ahrtr commented May 14, 2022

cc @mitake

@@ -179,6 +179,8 @@ type Cluster struct {

mu sync.Mutex
clusterClient *clientv3.Client

clusterAuthClients []*clientv3.Client
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is already a lot of different types of clients stored just to call .Close(), maybe we should just store list destructor functions? Even better would be to use t.Cleanup(func()) for call client.Close.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we should just store list destructor functions

I think that's a good idea. Maybe we could optimize it in a follow-up PR?

Copy link
Member

@serathius serathius May 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean this PR could just avoid introducing clusterAuthClients []*clientv3.Client, by using cleanup []func(). We can migrate others in future.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay. Make sense.

@chaochn47 chaochn47 marked this pull request as ready for review August 4, 2022 06:31
@chaochn47 chaochn47 changed the title [WIP] migrate auth_test to common: initial commit migrate auth_test to common Aug 4, 2022
@chaochn47 chaochn47 force-pushed the auth_test branch 2 times, most recently from a174737 to 394540f Compare August 4, 2022 07:23
@chaochn47
Copy link
Member Author

chaochn47 commented Aug 4, 2022

The failed test is unrelated to the PR.

--- FAIL: TestDowngradeUpgradeClusterOf3 (41.84s)

{
	"level": "warn",
	"ts": "2022-08-04T07:33:45.478Z",
	"caller": "embed/serve.go:120",
	"msg": "stopping grpc server due to error",
	"error": "accept tcp 127.0.0.1:20000: use of closed network connection"
}

ref. https://github.com/etcd-io/etcd/runs/7667198807?check_suite_focus=true

The symptom looks like similar to #14275.

However, I think this type of flake is inevitable without embed server restart retry given we run multiple tests in parallel. Kernel and tests races on TCP connection closing, opening.

See this test causes 41.84s while waiting for client connecting to the dead server. Retry server should prevent this type of flake. Do you have any thoughts on this? @serathius

@ahrtr
Copy link
Member

ahrtr commented Aug 8, 2022

Thanks @chaochn47 for the PR. More than 2K line of code change is too big, could you try to break down this PR into small ones?

server/auth/store.go Outdated Show resolved Hide resolved
tests/common/alarm_test.go Outdated Show resolved Hide resolved
@ahrtr
Copy link
Member

ahrtr commented Sep 5, 2022

Sorry for the late response. Are you still working on this PR?

@chaochn47
Copy link
Member Author

Thanks @ahrtr for the reminder

Yeah, I am working on the second common test framework changes mentioned above. #14331 but a little confused how to address the last comment raised by Merek. Will keep the PR updated.

@chaochn47
Copy link
Member Author

The latest code is on top of #14626, once it is merged, rebase will make the code diff smaller.

@chaochn47
Copy link
Member Author

Impacted by a flaky test case

=== FAIL: integration TestPeriodicCheckDetectsCorruption
    corrupt_test.go:100: 
        	Error Trace:	corrupt_test.go:100
        	Error:      	Not equal: 
        	            	expected: []*etcdserverpb.AlarmMember{(*etcdserverpb.AlarmMember)(0xc00bb8f9b0)}
        	            	actual  : []*etcdserverpb.AlarmMember{(*etcdserverpb.AlarmMember)(0xc00bb8f950)}
        	            	
        	            	Diff:
        	            	--- Expected
        	            	+++ Actual
        	            	@@ -2,3 +2,3 @@
        	            	  (*etcdserverpb.AlarmMember)({
        	            	-  MemberID: (uint64) 12970263741109204454,
        	            	+  MemberID: (uint64) 11662817823488512165,
        	            	   Alarm: (etcdserverpb.AlarmType) 2,
        	Test:       	TestPeriodicCheckDetectsCorruption

@fuweid
Copy link
Member

fuweid commented Nov 22, 2022

Impacted by a flaky test case


=== FAIL: integration TestPeriodicCheckDetectsCorruption

    corrupt_test.go:100: 

        	Error Trace:	corrupt_test.go:100

        	Error:      	Not equal: 

        	            	expected: []*etcdserverpb.AlarmMember{(*etcdserverpb.AlarmMember)(0xc00bb8f9b0)}

        	            	actual  : []*etcdserverpb.AlarmMember{(*etcdserverpb.AlarmMember)(0xc00bb8f950)}

        	            	

        	            	Diff:

        	            	--- Expected

        	            	+++ Actual

        	            	@@ -2,3 +2,3 @@

        	            	  (*etcdserverpb.AlarmMember)({

        	            	-  MemberID: (uint64) 12970263741109204454,

        	            	+  MemberID: (uint64) 11662817823488512165,

        	            	   Alarm: (etcdserverpb.AlarmType) 2,

        	Test:       	TestPeriodicCheckDetectsCorruption

Found that root cause and will file pr later.

Signed-off-by: Chao Chen <chaochn@amazon.com>
@ahrtr
Copy link
Member

ahrtr commented Nov 23, 2022

This PR is too big. So it is almost impossible to carefully check each line of code change. But overall looks clear and good to me.

Please try to use github.com/stretchr/testify to simplify the code.

Suggestion for similar PR in future: Please try to migrate the cases step by step.

I am OK if other maintainers agree to merge this PR. So defer to @serathius and @spzala to merge or comment this PR.

@ahrtr
Copy link
Member

ahrtr commented Nov 23, 2022

Another point is that please try to reuse the common functions in auth_util.go. If the existing functions do not meet some requirements, then please try to enhance the common functions, so that we can simplify the test cases as much as possible.

}
}

func authCredWriteKeyTest(cx ctlCtx) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cant find this tests. Was it migrated under new name or removed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was migrated to TestAuthWriteKey.

assert.Equal(t, 3, len(memberList.Members), "want 3 member but got %d", len(memberList.Members))
id := memberList.Members[0].ID

// 5 seconds is the minimum required amount of time peer is considered active
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where this come from? I don't see it in original authTestMemberRemove code

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it's not needed.

func TestAuthLeaseTestTimeToLiveExpired(t *testing.T) {
tcs := []struct {
name string
JWTEnabled bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see this used.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching that. Will fix this.

}
}

func TestAuthLeaseGrantLeases(t *testing.T) {
Copy link
Member

@serathius serathius Nov 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note, there are couple of tests for leases that do not really tests authorization. They are normal lease testing scenarios with only difference that authorization is enabled.

We should investigate if we can expand standard lease tests and treat authorization as part of cluster setup. This way we will not need to duplicate test scenario (one for lease without auth, second with auth enabled).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I think we can move this post the migration to think about it in the long term.

t.Fatalf("want leaseID %v expired but not", leaseID)
}
gresp, err := rootAuthClient.Get(ctx, "key", config.GetOptions{})
if err != nil || len(gresp.Kvs) != 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note, no need to fix it in this PR. This checking for empty response will really benefit from rewrite to testify.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack. Will fix it.

authEnable(ctx, cc, t)
rootAuthClient := testutils.MustClient(clus.Client(WithAuth("root", "root")))
authSetupDefaultTestUser(ctx, rootAuthClient, t)
if _, err := rootAuthClient.RoleGet(ctx, "test-role"); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Original test validated here contents of the role, like what keys it can write and what keys it can read.

	expected := []string{
		"Role test-role",
		"KV Read:", "foo",
		"KV Write:", "foo",
	}

if err != nil {
t.Fatal(err)
}
assert.ElementsMatch(t, resp.Roles, []string{"test-role"})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Original test also validated user name in response.

if err != nil {
t.Fatal(err)
}
assert.ElementsMatch(t, resp.Roles, []string{"test-role"})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Member

@serathius serathius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed and left couple fo comments. I would highly encourage migrating one tests at a time to make review quick. This one review iteration took me over 30 minutes and we will need more iterations.

If you send a separate PR that just migrates one test, I expect that review would take 2 minutes and I good give more detailed feedback.

@chaochn47
Copy link
Member Author

Thanks for the review, I will split those test cases into multiple incremental PRs.

Each PR contains 2 test cases.

@stale
Copy link

stale bot commented Mar 18, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging this pull request may close these issues.

5 participants