
Add a basic test: TestVmGroupVolumeIsolation #1368

Merged: 9 commits, Jun 12, 2017

Conversation

shaominchen (Contributor) commented Jun 7, 2017

  1. Added a new test case: TestVmGroupVolumeIsolation
  2. Refactored existing test cases to use new common utilities
  3. Fixed a cleanup issue in vmgroups_test
  4. Updated common utils LogTestStart & LogTestEnd and all related test cases.

Fixes issues #1263, #1372, and #1394.

s.vm2Name = s.config.DockerHostNames[1]
s.volName1 = inputparams.GetUniqueVolumeName(testName)
s.volName2 = inputparams.GetUniqueVolumeName(testName)
s.containerName = inputparams.GetContainerNameWithTimeStamp(testName)
Contributor

I was thinking maybe just put this into a dummy_test.go file and let these esx, ... variables be globals there. All test modules in the e2e package would be able to refer to those variables without each module doing the above. Any repetition of code anywhere could be avoided.
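A minimal sketch of that suggestion, assuming a shared dummy_test.go in the e2e package (the GetTestConfig accessor and the import path are illustrative; only EsxHost and DockerHostNames come from the snippets in this PR):

```go
// dummy_test.go - hypothetical shared fixtures for the e2e package.
package e2e

import "github.com/vmware/docker-volume-vsphere/tests/utils/inputparams"

// Package-level fixtures, initialized once. Every *_test.go file in the
// e2e package could refer to these instead of redeclaring local copies.
var (
	config  = inputparams.GetTestConfig() // hypothetical config accessor
	esx     = config.EsxHost
	vm1Name = config.DockerHostNames[0]
	vm2Name = config.DockerHostNames[1]
)
```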

Contributor

+1 on this

shaominchen (Contributor Author)

Personally I don't have much concern about this. We have already introduced the common TestConfig, which I feel is sufficient. I introduced these local variables just for convenience, because I personally prefer short names locally; otherwise I would just use s.config.XXX.

// Remove Config DB
admincli.ConfigRemove(s.esx)

misc.LogTestEnd(testName, "TestVmGroupVolumeIsolation")
govint (Contributor)

This test seems very similar, if not identical, to vmgroups_test.go:TestVmGroupVolumeAccessAcrossVmGroups(). If it needs enhancing, then the existing test can be enhanced vs. adding a test here. Plus, it's best to keep all vmgroup + volume related tests in one place.

shaominchen (Contributor Author)

I completely agree with you - currently the test cases we defined on the Dashboard have a lot of overlap and might be duplicated. I will take a closer look at the test case you pointed out, and decide what we should do to avoid the duplication.

shuklanirdesh82 (Contributor) commented Jun 8, 2017

> on the Dashboard have a lot of overlap and might be duplicated.

We have broken down tests into smaller steps, so you might see some steps overlap with others. As discussed in the past, we have to avoid such duplication during automation by enhancing the existing tests.

+1 on @govint's comment.

We should be cautious too before updating/modifying the existing one; if any assertion goes wrong, the following ones won't run.

shaominchen (Contributor Author)

@govint I took a further look into vmgroups_test.TestVmGroupVolumeAccessAcrossVmGroups(). This test is very similar to basic_test.TestVmGroupVolumeIsolation(). However, I think it makes sense to keep both, because the former verifies volume isolation between two user-defined vmgroups, while the latter verifies volume isolation between the _DEFAULT vmgroup and a user-defined vmgroup.

I agree we should keep all vmgroup related tests in one single place. I tried to move my test case into vmgroups_test.go. However, this needs too much refactoring of the existing logic in vmgroups_test.go, because the test assumption is different. In vmgroups_test.go, all tests assume that VM1 & VM2 belong to a user-defined vmgroup (we shouldn't call it "default" in the comments, which causes confusion with the _DEFAULT vmgroup). The current test case doesn't have this assumption. To avoid significant change to the existing vmgroups_test.go, I will keep this test in basic_test.go for now.
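For reference, a rough sketch of the flow being described, verifying isolation between the _DEFAULT vmgroup and a user-defined one (the admincli.CreateVMgroup/DeleteVMgroup and dockercli.CreateVolume helper names are assumptions; CheckVolumeAvailability, LogTestStart/LogTestEnd, and the Commentf assertion style appear elsewhere in this PR):

```go
// Sketch: a volume created in the _DEFAULT vmgroup must not be visible
// from a VM placed in a user-defined vmgroup.
func (s *BasicTestSuite) TestVmGroupVolumeIsolation(c *C) {
	misc.LogTestStart("basic_test", "TestVmGroupVolumeIsolation")

	// VM1 stays in _DEFAULT; move VM2 into a user-defined vmgroup T1.
	admincli.CreateVMgroup(s.esx, "T1", s.vm2Name) // hypothetical helper

	// A volume created from VM1 lands in _DEFAULT ...
	dockercli.CreateVolume(s.vm1, s.volName1) // hypothetical helper

	// ... and must not be available from VM2 in vmgroup T1.
	accessible := verification.CheckVolumeAvailability(s.vm2, s.volName1)
	c.Assert(accessible, Equals, false,
		Commentf("Volume %s is still available on [%s]", s.volName1, s.vm2))

	admincli.DeleteVMgroup(s.esx, "T1") // hypothetical helper
	misc.LogTestEnd("basic_test", "TestVmGroupVolumeIsolation")
}
```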

@@ -111,6 +111,9 @@ func (vg *VmGroupTest) TearDownSuite(c *C) {
 	out, err = ssh.InvokeCommand(vg.config.EsxHost, cmd)
 	log.Printf(out)
 
+	// Remove Config DB
+	adminutils.ConfigRemove(vg.config.EsxHost)

govint (Contributor)

This must not be done; it's creating a fresh and clean plate for the next test to run, and we may potentially wipe out a scenario that could lead to tests failing and bugs getting exposed. In fact, for all the e2e tests, create the config DB once and let the tests run through without removing the config DB (except for those tests that test the config DB itself).

lipingxue (Contributor)

If we decide to do this, we must make sure each test cleans up after itself properly. For example, if a test creates a VM group, it must remove that VM group when the test ends (not only in the successful case, but also in the failure case). I don't think the current test code does this.

shaominchen (Contributor Author)

I don't follow this. What's the default customer scenario we are expecting? I may be wrong, but I think a typical customer environment will be a clean vDVS installation without initializing the local or shared DB. So most e2e tests should run in the default setup, i.e. NotConfigured mode.

Multi-tenancy features are experimental. So only VmGroup related tests should take care of ConfigInit and ConfigRemove.

govint (Contributor)

@lipingxue the current test code removes all vmgroups in TearDownSuite(), which is always called irrespective of pass/fail of the test suite.

A test must remove all artifacts that are created by it. But why remove the config DB? After removing all test artifacts, is the DB sane? How do we know that it is?

If we remove the config DB (which btw isn't a test artifact) then most likely we are removing bugs as well. If the removal of test artifacts leaves the DB in a corrupt/inconsistent state - whether local or clustered - then shouldn't that be discovered in the testing?

We can't run without the config DB - customers are trying it out and hence it must be in all our tests.

shaominchen (Contributor Author)

@govint Yesterday we (sorry you were not included) discussed this and all agreed that most of our e2e tests should be running in the "default" mode. The "default" mode here means "no config DB", i.e. a fresh installation without running "config init". For now the multi-tenancy feature is experimental, so (at least for now) we are expecting most of our customers will be using vDVS in the "default" mode. This assumption will remain true until multi-tenancy is fully/correctly implemented and becomes an official product feature.

Coming back to e2e tests, we expect most of the test groups to run in the "default" mode without the DB configured. Only vmgroup related tests need to run "config init", and thus these tests are responsible for cleaning up the DB (i.e. running "config rm") so that other e2e tests still run in a clean "default" mode.

Hope this helps. Let me know if you still have questions about this.
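In gocheck terms, the division of responsibility described above might look like the following sketch (ConfigRemove is quoted from the diff in this PR; ConfigInit is assumed to be its counterpart):

```go
// Only vmgroup suites touch the config DB; all other e2e suites run in
// the "default" (NotConfigured) mode.
func (vg *VmGroupTest) SetUpSuite(c *C) {
	// Initialize the config DB, required for multi-tenancy operations.
	adminutils.ConfigInit(vg.config.EsxHost) // assumed counterpart of ConfigRemove
}

func (vg *VmGroupTest) TearDownSuite(c *C) {
	// Called whether the suite passed or failed, so the host is returned
	// to the clean "default" mode that other e2e suites expect.
	adminutils.ConfigRemove(vg.config.EsxHost)
}
```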

 func LogTestStart(testGroup string, testName string) {
-	log.Printf("START:%s %s %s", testGroup, testName, curTime())
+	log.Printf("START: %s.%s", testGroup, testName)
Contributor

Please retain the timestamp: when debugging we know exactly when a test started and when it finished, and hence can correlate with other logs (daemon, plugin). It also lets us know which tests are taking time. Hence it was defined this way.

shaominchen (Contributor Author)

I don't follow this. As I've already explained in the PR description, the Go log package itself prints a timestamp for each log message, for example:

2017/06/05 02:46:12 START: restart_test.TestPluginKill
2017/06/05 02:46:12 Attaching volume [restart_test_volume_1496630768] on VM[10.192.232.148]
2017/06/05 02:46:14 Confirming attached status for volume [restart_test_volume_1496630768]
2017/06/05 02:46:16 Killing vDVS plugin on VM [10.192.232.148]
2017/06/05 02:46:18 Sleep for 2 seconds

Why do we want to append a redundant timestamp?
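For context, the Go standard library's log package prefixes every message with the date and time by default (its default flags are log.LstdFlags = log.Ldate | log.Ltime), so a minimal program demonstrates the point:

```go
package main

import "log"

func main() {
	// No timestamp is passed explicitly; the default logger adds
	// "YYYY/MM/DD hh:mm:ss" on its own.
	log.Printf("START: %s.%s", "restart_test", "TestPluginKill")
	// Sample output: 2017/06/05 02:46:12 START: restart_test.TestPluginKill
}
```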


-func curTime() string {
-	return time.Now().Format(time.UnixDate)
+	log.Printf("END: %s.%s", testGroup, testName)
Contributor

Same comment for the timestamp.

shaominchen (Contributor Author) commented Jun 8, 2017

Same as above.


s.vm2Name = s.config.DockerHostNames[1]
s.volName1 = inputparams.GetUniqueVolumeName(testName)
s.volName2 = inputparams.GetUniqueVolumeName(testName)
s.containerName = inputparams.GetContainerNameWithTimeStamp(testName)
}

var _ = Suite(&BasicTestSuite{})

// Test volume lifecycle management on different datastores:
// VM1 - local VMFS datastore
Contributor

// VM1 - local VMFS datastore -> VM1 - created on local VMFS datastore

shaominchen (Contributor Author)

Will clarify this.


@@ -173,7 +173,7 @@ func VerifyDetachedStatus(name, hostName, esxName string) bool {
 	log.Printf("Confirming detached status for volume [%s]\n", name)
 
 	//TODO: Need to implement generic polling logic for better reuse
-	for attempt := 0; attempt < 30; attempt++ {
+	for attempt := 0; attempt < 60; attempt++ {
Contributor

Define a const for this instead of using a hard-coded "60".

shaominchen (Contributor Author)

I've already added a TODO here. We need a general poller util to avoid blind sleeping in the tests - see issue #1301.

Contributor

Isn't it better to control the attempt count via a const until we get to #1301? I would prefer to have a const so we don't need to dive into the code to adjust it; we can tune it just by changing the const.

shaominchen (Contributor Author)

Fine. Since you both prefer adding consts now, I will add it.
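A minimal sketch of what that const-based version might look like, pending the generic poller from #1301 (the const names and the volumeIsDetached helper are illustrative, not the repo's actual identifiers):

```go
import (
	"log"
	"time"
)

const (
	maxAttempts  = 60              // polling attempts before giving up
	pollInterval = 2 * time.Second // delay between attempts
)

// VerifyDetachedStatus polls until the volume reports detached or the
// attempt budget is exhausted, instead of hard-coding "60" in the loop.
func VerifyDetachedStatus(name, hostName, esxName string) bool {
	log.Printf("Confirming detached status for volume [%s]\n", name)
	for attempt := 0; attempt < maxAttempts; attempt++ {
		if volumeIsDetached(name, hostName, esxName) {
			return true
		}
		time.Sleep(pollInterval)
	}
	return false
}

// volumeIsDetached is an illustrative stand-in for the real status check,
// which would query the ESX host over SSH.
func volumeIsDetached(name, hostName, esxName string) bool {
	return false
}
```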

@@ -334,6 +334,7 @@ checkremote:
 # expects binaries to be deployed ot the VM and ESX (see deploy-all target)
 test-vm: checkremote deploy-vm-test
 	$(log_target)
+	-$(SSH) root@$(ESX) '/etc/init.d/vmdk-opsd start'
Contributor

Why do we need this here?

shaominchen (Contributor Author) commented Jun 8, 2017

This is a workaround for issue #1291 - we need to restart the service because it has crashed due to "Config Remove".

Contributor

Please remove this file and rebase with master to grab fix for #1291

}

var _ = Suite(&BasicTestSuite{})

// Test volume lifecycle management on different datastores:
// VM1 - local VMFS datastore
// VM2 - shared VMFS datastore
-// VM3 - shared VSAN datastore
+// VM3 - shared VSAN datastore (TODO: currently not available)
Contributor

What is the need for VM3 here? If I understand correctly, VM3 should be registered on the same ESX where VM1 & VM2 are; then it is easily configured on the CI as well as on a local testbed.

shaominchen (Contributor Author)

This test is to verify volume creation on different datastores: VM1 is created on a local VMFS datastore, VM2 on a shared VMFS datastore, and VM3 on a shared VSAN datastore. I will update these comments to avoid confusion.

shaominchen (Contributor Author)

@govint @lipingxue @shuklanirdesh82 Please review the latest updates.

lipingxue (Contributor) left a comment

LGTM

shuklanirdesh82 (Contributor) left a comment

Overall looks good to me.


shaominchen (Contributor Author)

Removed the workaround of restarting vmdk-opsd. Please take a look @govint @shuklanirdesh82

govint (Contributor) left a comment

Thanks for making the changes. Please check the note above on removing the config DB.


c.Assert(accessible, Equals, true, Commentf("Volume %s is not available on [%s]", s.volName2, s.vm1))

accessible = verification.CheckVolumeAvailability(s.vm2, s.volName2)
c.Assert(accessible, Equals, false, Commentf("Volume %s is still available on [%s]", s.volName2, s.vm2))
Contributor

nit: "Volume %s is still available on" => "Volume %s is available on" or "Volume %s should not be available on"

shaominchen (Contributor Author)

The phrase "... should not be available on" looks fine to me. But I searched our existing test code - it looks like we are all using "is still available" or "is still attached/detached" - please refer to restart_test, vmlistener_test, basic_test, swarm_test, etc. So I'd prefer to keep it consistent.

shuklanirdesh82 (Contributor) left a comment

LGTM, contingent on the following:

  1. Need to close the open comment from @govint.
  2. Need to fix the CI failure (not sure whether you have invoked the test run locally after rebasing with master).

Thanks!

govint (Contributor) left a comment

The issue of how the config DB is handled in the tests is still to be discussed; it doesn't need to block this PR at all.

govint mentioned this pull request Jun 12, 2017
govint (Contributor) commented Jun 12, 2017

@shaominchen, do the time delays added to the swarm tests need to be explained in the docs? Could a customer face the same issue as what's seen with these tests?

shuklanirdesh82 (Contributor) left a comment

Please make sure you resolve the conflict correctly. Contingent on the CI result.
