-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LIU-402: Improving regression testing in anticipation of schema refactor #278
base: master
Are you sure you want to change the base?
Conversation
- Setup more tests for LG/LGNodes - Provide example refactor in the form of a non-recursive lgn_to_pgn - Remove unnecessary local-class variables from PGT/Scheduler code that makes navigating it much harder.
Reviewer's Guide by SourceryThis pull request implements initial work on setting up regression testing for a schema refactor in the DALiuGE project. The changes focus on improving test coverage for logical and physical graph translator code, refactoring some parts of the code, and removing unnecessary local-class variables to enhance code readability. File-Level Changes
Tips
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @myxie - I've reviewed your changes and they look great!
Here's what I looked at during the review
- 🟡 General issues: 3 issues found
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟢 Complexity: all looks good
- 🟢 Documentation: all looks good
Help me be more useful! Please click 👍 or 👎 on each comment to tell me if it was helpful.
# print("N (makespan) is ", N, "M is ", M) | ||
ma = np.zeros((M, N), dtype=int) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: Consider impact on readability of using self._max_dop directly
While using self._max_dop directly is more consistent, consider if a local variable M = self._max_dop might improve readability in this method.
else: | ||
for child in lgn.children: | ||
# Approach next 'set' of children | ||
c_copy = copy.deepcopy(child) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (performance): Consider performance impact of using deepcopy
Using copy.deepcopy can be expensive for large objects. Consider if a shallow copy would suffice, or if this operation is necessary at all.
c_copy = copy.copy(child)
if hasattr(child, '__dict__'):
c_copy.__dict__ = copy.copy(child.__dict__)
lm = self._oid_gid_map | ||
lm2 = self._gid_island_id_map |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: Consider keeping descriptive variable names
While removing these variables simplifies the code, the descriptive names 'lm' and 'lm2' might have aided in understanding. Consider adding a comment explaining what these maps represent.
# Map of object IDs to group IDs
oid_gid_map = self._oid_gid_map
# Map of group IDs to island IDs
gid_island_id_map = self._gid_island_id_map
# when #partitions < #nodes the oid_gid_map values are spread around range(#nodes)
# which leads to index out of range errors (TODO: find how _oid_gid_map is
for lg_name, num_keys in self.lg_names.items(): | ||
fp = path_utils.get_lg_fpath("logical_graphs", lg_name) | ||
lg = LG(fp, ssid=TEST_SSID) | ||
self.assertEqual(num_keys, | ||
len(lg._done_dict.keys()), | ||
f"Incorrect number of elements when constructing LG " | ||
f"object using: {lg_name}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (code-quality): Avoid loops in tests. (no-loop-in-tests
)
Explanation
Avoid complex code, like loops, in test functions.Google's software engineering guidelines says:
"Clear tests are trivially correct upon inspection"
To reach that avoid complex code in tests:
- loops
- conditionals
Some ways to fix this:
- Use parametrized tests to get rid of the loop.
- Move the complex logic into helpers.
- Move the complex part into pytest fixtures.
Complexity is most often introduced in the form of logic. Logic is defined via the imperative parts of programming languages such as operators, loops, and conditionals. When a piece of code contains logic, you need to do a bit of mental computation to determine its result instead of just reading it off of the screen. It doesn't take much logic to make a test more difficult to reason about.
Software Engineering at Google / Don't Put Logic in Tests
sk = "".join([str(int(round(xi))) for xi in x[0: self._topk]]) | ||
stuff = self._sspace_dict.get(sk, None) | ||
if stuff is None: | ||
if not stuff: | ||
G = self._lite_dag.copy() | ||
stuff = self._partition_G(G, x) | ||
self._sspace_dict[sk] = stuff[0:2] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (code-quality): Replace a[0:x] with a[:x] and a[x:len(a)] with a[x:] [×2] (remove-redundant-slice-index
)
@@ -1046,11 +1033,11 @@ | |||
indices of x is identical to the indices in G.edges().sort(key='weight') | |||
""" | |||
# first check if the solution is already available in the search space | |||
sk = "".join([str(int(round(xi))) for xi in x[0 : self._topk]]) | |||
sk = "".join([str(int(round(xi))) for xi in x[0: self._topk]]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (code-quality): We've found these issues:
- Replace a[0:x] with a[:x] and a[x:len(a)] with a[x:] [×2] (
remove-redundant-slice-index
) - Replace if statement with if expression (
assign-if-exp
)
@@ -1264,7 +1251,7 @@ | |||
|
|||
@staticmethod | |||
def build_dag_from_drops( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (code-quality): We've found these issues:
- Merge duplicate blocks in conditional (
merge-duplicate-blocks
) - Remove redundant conditional (
remove-redundant-if
) - Replace multiple comparisons of same variable with
in
operator (merge-comparisons
) - Low code quality found in DAGUtil.build_dag_from_drops - 16% (
low-code-quality
)
Explanation
The quality score for this function is below the quality threshold of 25%.
This score is a combination of the method length, cognitive complexity and working memory.
How can you solve this?
It might be worth refactoring this function to make it shorter and more readable.
- Reduce the function length by extracting pieces of functionality out into
their own functions. This is the most important thing you can do - ideally a
function should be less than 10 lines. - Reduce nesting, perhaps by introducing guard clauses to return early.
- Ensure that variables are tightly scoped, so that code using related concepts
sits together within the function rather than being scattered.
iterate through them and get the number of Physical Graph Nodes for every one of | ||
the Logical Graph nodes. | ||
""" | ||
return sum([len(drop_list) for drop_list in drop_values]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (code-quality): Replace unneeded comprehension with generator (comprehension-to-generator
)
return sum([len(drop_list) for drop_list in drop_values]) | |
return sum(len(drop_list) for drop_list in drop_values) |
for lgn in lg_non_recursive._start_list: | ||
lg_non_recursive.lgn_to_pgn(lgn, recursive=False) | ||
|
||
expected_test_loop_drops = 11 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (code-quality): Extract duplicate code into method (extract-duplicate-method
)
Note: Merge after #280. This features some of those changes, which is why this branch is targeting LIU-404. Update this branch to merge to
master
once #280 has been merged.Issue
Moving towards the new JSON schema, we want to have a comprehensive coverage of the existing logical and physical graph translator code, so we comfortably make changes without fear of introducing regressions that go unnoticed (See https://icrar.atlassian.net/browse/LIU-402 for more information).
There are also some frustrating examples of updating class-variables using proxy local variables in the translator code, which would benefit from being removed as they make the code much harder to read and navigate.
Solution
The following has been achieved in this PR:
The refactor example demonstrates the value of the new test cases.
Summary by Sourcery
Set up regression testing for the schema refactor by adding extensive tests for logical and physical graph translators. Introduce a non-recursive implementation of the
lgn_to_pgn
method and refactor code to improve readability by removing unnecessary variables and using a new utility module for file path handling.New Features:
lgn_to_pgn
method in the logical graph translator.Enhancements:
path_utils
for handling file paths, replacing previous inline implementations.Tests:
lgn_to_pgn
method to demonstrate its functionality and validate its output.