LIU-402: Improving regression testing in anticipation of schema refactor #278

myxie · 2024-08-27T02:54:36Z

Note: Merge after #280. This features some of those changes, which is why this branch is targeting LIU-404. Update this branch to merge to master once #280 has been merged.

Issue

As we move towards the new JSON schema, we want comprehensive test coverage of the existing logical and physical graph translator code, so that we can comfortably make changes without fear of introducing regressions that go unnoticed (see https://icrar.atlassian.net/browse/LIU-402 for more information).

There are also some frustrating examples of updating class-variables using proxy local variables in the translator code, which would benefit from being removed as they make the code much harder to read and navigate.
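As a rough illustration of why the proxy-variable pattern is hard to follow, here is a minimal sketch. The `Partitioner` class and its `_gid_map` attribute are hypothetical and are not taken from the translator code; the point is only that mutating through a local alias changes the attribute, while rebinding the alias does not.

```python
class Partitioner:
    """Hypothetical class used only to illustrate aliasing of attributes."""

    def __init__(self):
        self._gid_map = {}

    def update_via_alias(self, oid, gid):
        lm = self._gid_map   # local name bound to the same dict object
        lm[oid] = gid        # mutates self._gid_map in place

    def rebind_alias(self):
        lm = self._gid_map
        lm = {}              # rebinds the local name only; attribute untouched
        return lm

    def update_direct(self, oid, gid):
        self._gid_map[oid] = gid   # the attribute being changed is explicit


p = Partitioner()
p.update_via_alias("a", 1)
p.rebind_alias()
p.update_direct("b", 2)
```

With the direct form, a reader does not need to track which local name refers to which attribute to see what state is being changed.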

Solution

The following has been achieved in this PR:

Set up more tests for LG/LGNodes
Provide an example refactor in the form of a non-recursive lgn_to_pgn
Remove unnecessary local-class variables from PGT/Scheduler code that make navigating it much harder.

The refactor example demonstrates the value of the new test cases.
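To show the kind of check that makes such a refactor safe, here is a minimal sketch that compares a recursive traversal with an explicit-stack version. The `Node` class and both `visit_*` functions are hypothetical stand-ins and do not reproduce the actual lgn_to_pgn logic.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a logical graph node with children."""
    name: str
    children: List["Node"] = field(default_factory=list)


def visit_recursive(node: Node) -> List[str]:
    """Depth-first, pre-order traversal using the call stack."""
    visited = [node.name]
    for child in node.children:
        visited.extend(visit_recursive(child))
    return visited


def visit_iterative(node: Node) -> List[str]:
    """Same pre-order traversal using an explicit stack instead of recursion."""
    visited = []
    stack = [node]
    while stack:
        current = stack.pop()
        visited.append(current.name)
        # Push children in reverse so the leftmost child is processed first.
        stack.extend(reversed(current.children))
    return visited


tree = Node("root", [Node("a", [Node("a1"), Node("a2")]), Node("b")])
```

A regression test can then assert that both functions return the same sequence for the same input, which is the property the new test cases rely on.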

Summary by Sourcery

Set up regression testing for the schema refactor by adding extensive tests for logical and physical graph translators. Introduce a non-recursive implementation of the lgn_to_pgn method and refactor code to improve readability by removing unnecessary variables and using a new utility module for file path handling.

New Features:

Introduce a non-recursive implementation of the lgn_to_pgn method in the logical graph translator.

Enhancements:

Remove unnecessary local-class variables from the PGT/Scheduler code to improve code readability and navigation.
Refactor the code to use a new utility module path_utils for handling file paths, replacing previous inline implementations.

Tests:

Add comprehensive regression tests for logical and physical graph translator code to ensure backward compatibility and prevent unnoticed regressions.
Implement tests for the non-recursive lgn_to_pgn method to demonstrate its functionality and validate its output.

sourcery-ai · 2024-08-30T08:06:17Z

Reviewer's Guide by Sourcery

This pull request implements initial work on setting up regression testing for a schema refactor in the DALiuGE project. The changes focus on improving test coverage for logical and physical graph translator code, refactoring some parts of the code, and removing unnecessary local-class variables to enhance code readability.

File-Level Changes

| Change | Details | Files |
| --- | --- | --- |
| Added new test cases for LG/LGNodes | Created TestLGInit class to test LG object construction; implemented TestLGNToPGN class to verify LGN to PGN conversion; added TestLGUnroll class to test LG unrolling process; introduced TestLGNodeLoading class to test SubGraph data node storage | `daliuge-translator/test/dropmake/test_lg.py` |
| Refactored lgn_to_pgn method to include a non-recursive option | Added 'recursive' parameter to lgn_to_pgn method; implemented non-recursive logic for traversing child nodes; updated related code to support both recursive and non-recursive approaches | `daliuge-translator/dlg/dropmake/lg.py` |
| Removed unnecessary local-class variables from PGT/Scheduler code | Eliminated redundant variable assignments; simplified code by directly using class attributes instead of local variables; removed unused variables and imports | `daliuge-translator/dlg/dropmake/scheduler.py`, `daliuge-translator/dlg/dropmake/pgt.py` |
| Improved code organization and modularity | Created new path_utils.py file for handling file paths; moved test-related utility functions to separate files; refactored import statements to use new utility modules | `daliuge-translator/dlg/dropmake/path_utils.py`, `daliuge-translator/test/dropmake/test_pg_gen.py` |
| Enhanced error handling and type safety | Added type hints and docstrings to improve code clarity; implemented more robust error checking in various functions; updated exception handling to provide more informative error messages | `daliuge-translator/dlg/dropmake/lg_node.py`, `daliuge-translator/dlg/dropmake/dm_utils.py` |

sourcery-ai

Hey @myxie - I've reviewed your changes and they look great!

Here's what I looked at during the review

🟡 General issues: 3 issues found
🟢 Security: all looks good
🟢 Testing: all looks good
🟢 Complexity: all looks good
🟢 Documentation: all looks good

sourcery-ai · 2024-08-30T08:07:42Z

daliuge-translator/dlg/dropmake/scheduler.py

            # print("N (makespan) is ", N, "M is ", M)
-            ma = np.zeros((M, N), dtype=int)


suggestion: Consider impact on readability of using self._max_dop directly

While using self._max_dop directly is more consistent, consider if a local variable M = self._max_dop might improve readability in this method.

sourcery-ai · 2024-08-30T08:07:42Z

daliuge-translator/dlg/dropmake/lg.py

+                else:
+                    for child in lgn.children:
+                    # Approach next 'set' of children
+                        c_copy = copy.deepcopy(child)


suggestion (performance): Consider performance impact of using deepcopy

Using copy.deepcopy can be expensive for large objects. Consider if a shallow copy would suffice, or if this operation is necessary at all.

c_copy = copy.copy(child)
if hasattr(child, '__dict__'):
    c_copy.__dict__ = copy.copy(child.__dict__)
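Whether a shallow copy suffices depends on whether the copies need independent nested state. The sketch below uses a hypothetical `ChildNode` class (not the project's node class) to show the difference in behaviour between `copy.copy` and `copy.deepcopy`.

```python
import copy


class ChildNode:
    """Hypothetical node with a nested, mutable attribute."""

    def __init__(self, name, inputs):
        self.name = name
        self.inputs = inputs


original = ChildNode("child", ["in0"])
shallow = copy.copy(original)      # new object, but shares the inputs list
deep = copy.deepcopy(original)     # new object with its own inputs list

shallow.inputs.append("in1")       # also visible through original.inputs
deep.inputs.append("in2")          # leaves original.inputs untouched
```

If the copied children are later mutated independently, the shared list in the shallow copy would leak changes back into the original node, so the cheaper copy is only safe when nested attributes are treated as read-only.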

sourcery-ai · 2024-08-30T08:07:42Z

daliuge-translator/dlg/dropmake/pgt.py

-        lm = self._oid_gid_map
-        lm2 = self._gid_island_id_map


suggestion: Consider keeping descriptive variable names

While removing these variables simplifies the code, the descriptive names 'lm' and 'lm2' might have aided in understanding. Consider adding a comment explaining what these maps represent.

# Map of object IDs to group IDs
oid_gid_map = self._oid_gid_map
# Map of group IDs to island IDs
gid_island_id_map = self._gid_island_id_map
# when #partitions < #nodes the oid_gid_map values are spread around range(#nodes)
# which leads to index out of range errors (TODO: find how _oid_gid_map is

sourcery-ai · 2024-08-30T08:07:42Z

daliuge-translator/test/dropmake/test_lg.py

+        for lg_name, num_keys in self.lg_names.items():
+            fp = path_utils.get_lg_fpath("logical_graphs", lg_name)
+            lg = LG(fp, ssid=TEST_SSID)
+            self.assertEqual(num_keys,
+                             len(lg._done_dict.keys()),
+                             f"Incorrect number of elements when constructing LG "
+                             f"object using: {lg_name}")


issue (code-quality): Avoid loops in tests. (no-loop-in-tests)

Explanation
Avoid complex code, like loops, in test functions.
Google's software engineering guidelines say:
"Clear tests are trivially correct upon inspection"
To reach that avoid complex code in tests:

loops

conditionals

Some ways to fix this:

Use parametrized tests to get rid of the loop.

Move the complex logic into helpers.

Move the complex part into pytest fixtures.

Complexity is most often introduced in the form of logic. Logic is defined via the imperative parts of programming languages such as operators, loops, and conditionals. When a piece of code contains logic, you need to do a bit of mental computation to determine its result instead of just reading it off of the screen. It doesn't take much logic to make a test more difficult to reason about.

Software Engineering at Google / Don't Put Logic in Tests
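One way to apply the "move the complex logic into helpers" option is sketched below with `unittest`. The `load_graph` function, the graph names and the expected counts are all hypothetical; in the real test the helper would construct the LG object instead.

```python
import unittest


def load_graph(name):
    """Hypothetical loader; returns a dict standing in for an LG's nodes."""
    fake_graphs = {
        "small": {"n1": {}, "n2": {}},
        "large": {"n1": {}, "n2": {}, "n3": {}, "n4": {}},
    }
    return fake_graphs[name]


class TestGraphInit(unittest.TestCase):
    def assert_key_count(self, name, expected):
        """Helper that carries the shared logic, so each test stays flat."""
        graph = load_graph(name)
        self.assertEqual(expected, len(graph),
                         f"Incorrect number of elements for: {name}")

    def test_small_graph(self):
        self.assert_key_count("small", 2)

    def test_large_graph(self):
        self.assert_key_count("large", 4)
```

Each graph now gets its own named test, so a failure points at a single input rather than at a loop iteration.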

sourcery-ai · 2024-08-30T08:07:44Z

daliuge-translator/dlg/dropmake/scheduler.py

+        sk = "".join([str(int(round(xi))) for xi in x[0: self._topk]])
        stuff = self._sspace_dict.get(sk, None)
-        if stuff is None:
+        if not stuff:
            G = self._lite_dag.copy()
            stuff = self._partition_G(G, x)
            self._sspace_dict[sk] = stuff[0:2]


issue (code-quality): Replace a[0:x] with a[:x] and a[x:len(a)] with a[x:] [×2] (remove-redundant-slice-index)
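A small sketch of the slice rule, using made-up values for `x` and `topk` (the real method reads `self._topk`): omitting the leading `0` produces the same slice and therefore the same key.

```python
x = [0.2, 1.7, 0.9, 1.1]
topk = 3

# Explicit start index, as in the current code.
explicit = "".join(str(int(round(xi))) for xi in x[0:topk])
# Implicit start index, as the rule suggests.
implicit = "".join(str(int(round(xi))) for xi in x[:topk])
```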

sourcery-ai · 2024-08-30T08:07:44Z

daliuge-translator/dlg/dropmake/scheduler.py

@@ -1046,11 +1033,11 @@
        indices of x is identical to the indices in G.edges().sort(key='weight')
        """
        # first check if the solution is already available in the search space
-        sk = "".join([str(int(round(xi))) for xi in x[0 : self._topk]])
+        sk = "".join([str(int(round(xi))) for xi in x[0: self._topk]])


issue (code-quality): We've found these issues:

Replace a[0:x] with a[:x] and a[x:len(a)] with a[x:] [×2] (remove-redundant-slice-index)

Replace if statement with if expression (assign-if-exp)

sourcery-ai · 2024-08-30T08:07:44Z

daliuge-translator/dlg/dropmake/scheduler.py

@@ -1264,7 +1251,7 @@

    @staticmethod
    def build_dag_from_drops(


issue (code-quality): We've found these issues:

Merge duplicate blocks in conditional (merge-duplicate-blocks)

Remove redundant conditional (remove-redundant-if)

Replace multiple comparisons of same variable with in operator (merge-comparisons)

Low code quality found in DAGUtil.build_dag_from_drops - 16% (low-code-quality)

Explanation

The quality score for this function is below the quality threshold of 25%.
This score is a combination of the method length, cognitive complexity and working memory.

How can you solve this?

It might be worth refactoring this function to make it shorter and more readable.

Reduce the function length by extracting pieces of functionality out into
their own functions. This is the most important thing you can do - ideally a
function should be less than 10 lines.

Reduce nesting, perhaps by introducing guard clauses to return early.

Ensure that variables are tightly scoped, so that code using related concepts
sits together within the function rather than being scattered.
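The guard-clause and merge-comparisons advice can be sketched as follows. The `drop_weight` function, the `kind` values and the dictionary keys are hypothetical and only illustrate the shape of the refactor, not the contents of build_dag_from_drops.

```python
def drop_weight(drop):
    """Hypothetical helper: returns a weight for a drop-like dict."""
    # Guard clause: handle the unusual case first and return early.
    if drop is None:
        return 0

    kind = drop.get("kind")
    # `in` replaces `kind == "app" or kind == "socket"`.
    if kind in ("app", "socket"):
        return drop.get("weight", 1)

    # Everything else is treated as data in this sketch.
    return drop.get("size", 1)
```

Pulling a branch like this out of a long method shortens the method and keeps the variables used by that branch scoped to the helper.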

sourcery-ai · 2024-08-30T08:07:44Z

daliuge-translator/test/dropmake/test_lg.py

+    iterate through them and get the number of Physical Graph Nodes for every one of
+    the Logical Graph nodes.
+    """
+    return sum([len(drop_list) for drop_list in drop_values])


suggestion (code-quality): Replace unneeded comprehension with generator (comprehension-to-generator)

Suggested change

-    return sum([len(drop_list) for drop_list in drop_values])
+    return sum(len(drop_list) for drop_list in drop_values)
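For reference, a minimal sketch with made-up `drop_values` showing that both forms give the same total; the generator version simply avoids building the intermediate list before summing.

```python
drop_values = [["d1", "d2"], ["d3"], []]

# List comprehension: builds a temporary list [2, 1, 0] first.
total_from_list = sum([len(drop_list) for drop_list in drop_values])
# Generator expression: feeds each length to sum() as it is produced.
total_from_generator = sum(len(drop_list) for drop_list in drop_values)
```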

sourcery-ai · 2024-08-30T08:07:44Z

daliuge-translator/test/dropmake/test_lg.py

+        for lgn in lg_non_recursive._start_list:
+            lg_non_recursive.lgn_to_pgn(lgn, recursive=False)
+
+        expected_test_loop_drops = 11


issue (code-quality): Extract duplicate code into method (extract-duplicate-method)

myxie added 3 commits August 27, 2024 10:52

LIU-402: Initial work setting regression testing for schema refactor

ccb0eb9

- Setup more tests for LG/LGNodes
- Provide example refactor in the form of a non-recursive lgn_to_pgn
- Remove unnecessary local-class variables from PGT/Scheduler code that makes navigating it much harder.

Merge branch 'LIU-404' of https://github.com/icrar/daliuge into LIU-402

cec5910

LIU-402: Complete separation of logical graph tests.

cfd8cde

myxie changed the base branch from master to LIU-404 August 30, 2024 07:59

myxie marked this pull request as ready for review August 30, 2024 08:06

sourcery-ai bot reviewed Aug 30, 2024

myxie changed the title ~~LIU-402: Initial work setting regression testing for schema refactor~~ LIU-402: Improving regression testing in anticipation of schema refactor Aug 30, 2024

awicenec force-pushed the LIU-402 branch from 872e7c0 to cfd8cde Compare September 9, 2024 08:45

awicenec force-pushed the LIU-404 branch from 9780591 to a0ce7b9 Compare September 9, 2024 08:45

Base automatically changed from LIU-404 to master September 25, 2024 04:51

myxie mentioned this pull request Sep 25, 2024

LIU-382: Demonstrate transition to importlib.resources #264

Closed
