Random Replacement #274
Conversation
/gcbrun
Thanks! Round of comments on this!
    "provided."
)

countReplaceOptions = [
Use snake_case here (e.g. `count_replace_options`) — not camelCase — for local variables.
    f"Received: rate={rate}"
)

if [self.skip_list, self.skip_fn, self.skip_py_fn].count(None) < 2:
nit: format this the same as the check lower down in the file.
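As a hedged sketch (plain Python, with argument names taken from the diff above), the validation could be written in the same style as the later check:

```python
def validate_skip_args(skip_list=None, skip_fn=None, skip_py_fn=None):
    # At most one of the three skip arguments may be provided; fewer than
    # two `None`s means the caller passed more than one.
    if [skip_list, skip_fn, skip_py_fn].count(None) < 2:
        raise ValueError(
            "Exactly one of `skip_list`, `skip_fn`, `skip_py_fn` may be "
            "provided."
        )
```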
    num_to_select, self.max_replacements
)
num_to_select = tf.math.minimum(
    num_to_select, tf.cast(positions.row_lengths(), tf.int32)
This should not be needed — how could the binomial sample exceed `positions.row_lengths()` when those lengths are given as the counts?
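The reviewer's point can be checked directly: a binomial sample is bounded above by its count parameter, so clamping against the row lengths is redundant when those lengths are the counts. A quick illustration (NumPy used here only for brevity, in place of the TF sampling op):

```python
import numpy as np

rng = np.random.default_rng(0)
row_lengths = np.array([3, 5, 2, 8])
# Sample how many positions to replace per row; the counts are the row lengths.
samples = rng.binomial(n=row_lengths, p=0.5)
# Each binomial draw is at most its count, so no extra minimum is needed.
assert np.all(samples <= row_lengths)
```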
# Convert to ragged tensor.
inputs = tf.RaggedTensor.from_tensor(inputs)

skip_masks = None
Maybe, in the interest of readability, pull this whole block into a private `_generate_skip_mask()` method.
        seed=self._generator.make_seeds()[:, 0],
    )
]
synonym = inputs[index]
Careful not to let this name infer the use case too much. Choose names like `original_token` and `replacement_token` that don't assume word-level or character-level usage.
]
synonym = inputs[index]

if self.replacement_fn is not None:
Can we pull out a private method `_replace_token()` to make this more readable?
if self.replacement_fn is not None:
    synonym = self.replacement_fn(synonym)
inputs = tf.tensor_scatter_nd_update(
I'm skeptical we really need a `map_fn` with a loop inside it, with a scatter inside it. That sounds like it will be inefficient. If we follow the flow we have in random deletion, we will have a complete mask containing only the indices we want to run a replacement on, right? Then we could do something like run a `tf.map_fn` over the pairs of `(inputs.flat_values, mask.flat_values)` and early return if `mask == False`; otherwise we look up a replacement. There might be other ways to do this, but overall we should:
- avoid nested maps/loops
- avoid scatters inside a map/loop
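A rough sketch of the mask-based flow the review describes, over already-flattened values (NumPy stands in for the TF ops; the replacement lookup is a placeholder):

```python
import numpy as np

flat_values = np.array([1, 2, 3, 4, 5])
# Complete mask: True only at the indices selected for replacement.
mask = np.array([True, False, True, False, True])
# Compute a candidate replacement for every position (placeholder lookup),
# then select with the mask -- no nested loop, no scatter.
replacements = flat_values * 10
outputs = np.where(mask, replacements, flat_values)
# outputs -> [10, 2, 30, 4, 50]
```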
PR for Random Replacement Layer