Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unquote/unpacking improvements #1545

Closed
wants to merge 3 commits into from

Conversation

vodik
Copy link
Contributor

@vodik vodik commented Mar 25, 2018

A pile of changes dealing with some ugly corner cases in the language

Fixes #1519

The problem here is calling replace can trigger a call to replace_hy_obj, which will try to also wrap values inside a HyObject.

However, this means when we try to recursively replace inside a nested object, like a HyList, we should try and capture these new wrapped objects.

Hy still works if we don't, but it results in the object getting rewrapped multiple times during a compile.

But with that change in place, wrap_value should now only be called once per expression. This opens the door to safely wrap iterables. So instead of crashing when trying to wrap an iterable, first expand it and then try to wrap the resulting list.

This allows unquote to work with iterators and generators, which previously didn't work unless you used unquote-splice or explicitly expanded a list first.

And lastly, use PEP 448 for better unquoted expressions (inspired by @gilch's comment in #1542)

With Python 3.5, we can have an arbitrary number of unpackings. We can use this to emit cleaner expressions.

@vodik vodik changed the title Unpacking improvements unquote/unpacking improvements Mar 25, 2018
@vodik vodik force-pushed the unpacking-improvements branch 2 times, most recently from e8feade to bfff295 Compare March 25, 2018 07:08
Calling `replace` can trigger a call to `replace_hy_obj`, which will
try to also wrap values inside a `HyObject`.

However, this means when we try to recusively replace inside a nested
object, like a `HyList`, we should try and capture these new wrapped
objects.

Hy still works if we don't, but it results in the object getting
rewrapped multiple times during a compile.

Closes hylang#1519
Instead of crashing when trying to unwrap an iterable, first expand it
and then try to wrap the resulting list.

This allows `unquote` to work with iterators and generators, which
previously didn't work unless you used `unquote-splice` or explicitly
expanded a list first.
With Python 3.5, we can have an arbitrary number of unpackings. We can
use this to emit cleaner expressions.

Falls back to legacy behaviour for older Python versions.
@vodik
Copy link
Contributor Author

vodik commented Mar 25, 2018

So the big use case for this patch is letting map and unquote work well together.

I started hitting this when looking at some Clojure macro tutorials to brush up. The following snippet doesn't work currently on Python 3 because map is a generator:

(eval-and-compile
  (defn criticize-code
    [criticism code]
    `(print ~criticism (quote ~code))))

(defmacro code-critic
  [bad good]
  `(do ~(map (fn [x] (criticize-code #* x))
        [["Cursed bacteria of Liberia, this is bad code:" bad]
         ["Sweet sacred boa of Western and Eastern Samoa, this is good code:" good]])))

(code-critic (1 + 1) (+ 1 1))

But it feels natural to write. I understand tossing an explicit list in there will make it work, but it also makes it noisier.

I think we should embrace the strengths of Python and try to leverage generators like this.

@vodik
Copy link
Contributor Author

vodik commented Mar 25, 2018

And I've been trying to think what would be the best wording for the NEWS entry because of the generator handling:

=> (eval-and-compile
... (defn numbers [] (yield-from [1 2 3])))
=> (quote (sum (unquote (numbers))))
'(sum ~(numbers))
=> (quasiquote (sum (unquote (numbers))))
'(sum <generator object numbers at 0x7ff6050f5bf8>)
=> (eval *1)
6

Or if that generator object needs to be expanded directly, but not sure how to do it.

@vodik
Copy link
Contributor Author

vodik commented Mar 25, 2018

Yeah, something interesting is happening here for sure: it works, but:

=> (setv sym `(sum ~(genexpr x [x (range 10)] (odd? x))))
=> sym
'(sum <generator object <genexpr> at 0x7fa63dc589e8>)
=> (eval sym)
25
=> sym
'(sum [1 3 5 7 9])

@vodik
Copy link
Contributor Author

vodik commented Mar 25, 2018

Okay, so addressing the generator rewriting, I have a fix for it, but its not trivial and I can only make it work by completely breaking line numbering...

The problem is that we should probably be calling wrap_obj in HyList constructor. I think we probably want to do that anyways as other Hy objects (like HyCons) do wraps on construction.

But that opens a whole different can of worms.

It seems we throw away line numbering information on a lot of intermediate forms inside the compiler (for example, slicing a HyExpression results in a new HyExpression without line numbers) which is fixed up later. Wrapping forms in HyList.__init__ starts exposing lots of missing line number bugs for reasons I haven't determined yet.

So probably there are some subtle bugs here related to #1542.

Don't know how to proceed. I'd be comfortable merging it with the mutation and then trying to fix these issues as I really would like to see the functionality work.

Any ideas?

@gilch
Copy link
Member

gilch commented Mar 25, 2018

We could make a core tag macro to convert to tuple:

(deftag , [iterable]
  `(tuple ~iterable))

Python3 uses iterators a lot more than Python2 did, so realizing one into a tuple is something we do often. This would make using generators in macros pretty easy.

=> #,(range 10)
tuple(range(10))

(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)

@vodik
Copy link
Contributor Author

vodik commented Mar 25, 2018

I actually really like that idea, should we do both? Or are you suggesting this as an alternative?

I mean, ~#,(map ...) seems a little noisy to me, and I think its inconsistent that ~@(map ...) works just.

I do think we should embrace generators in the core like this considering all the extensive places that Python 3 uses them and everywhere they're accepted.

@Kodiologist
Copy link
Member

This PR looks pretty good to me as is (but don't forget NEWS, of course).

I don't think the new promotion semantics for generators are strictly necessary, because you can always write (, ~@foo) or [~@foo] and then it's explicit what you're getting, but I'm okay with them.

Your example is kind of inevitable considering that iterators in Python mutate whenever you take items from them.

The problem is that we should probably be calling wrap_obj in HyList constructor. I think we probably want to do that anyways as other Hy objects (like HyCons) do wraps on construction.

I don't think we should promote things to Hy models in the model constructors because promotion should happen in only one place, which is just before compilation. HyCons is the only exception, and even that may ultimately be wrong, because HyCons has always been pathological (#568).

@gilch
Copy link
Member

gilch commented Mar 25, 2018

Generators are mutable, may have side effects, and can only be read once. Laziness is great, but iterators are a pain to use compared to Clojure's seq abstraction. hylang/hyrule#32.

I'm confused by your motivating examples. The output isn't using the Hy model reprs, and the code critic isn't indented right. After counting brackets, I get,

(defmacro code-critic
  [bad good]
  `(do ~(map (fn [x] (criticize-code #* x))
             [["Cursed bacteria of Liberia, this is bad code:" bad]
              ["Sweet sacred boa of Western and Eastern Samoa, this is good code:" good]])))

So is the intended expansion is like,

(do
  (print "Cursed bacteria of Liberia, this is bad code:" '(1 + 1))
  (print "Sweet sacred boa of Western and Eastern Samoa, this is good code:" '(+ 1 1)))

?
But in that case you need to splice with ~@! I don't see how making ~ act like ~@ when it happens to be unquoting a generator is a good idea at all. Let's be consistent here.

Here's another example with the default pretty repr and --spy.

=> (setv sym `(sum ~(genexpr x [x (range 10)] (odd? x))))
from hy.core.language import is_odd
from hy import HyExpression, HySymbol
sym = HyExpression([] + [HySymbol('sum')] + [(x for x in range(10) if
    is_odd(x))])
None

=> sym
sym

HyExpression([
  HySymbol('sum'),
  <generator object <genexpr> at 0x000002B637B10B48>])

Of course it's a generator here, what did you expect? What bothers me is that eval can compile this at all, since it's not a proper hytree. And in fact, it doesn't.

=> (eval sym)
from hy.core.language import eval
eval(sym)

Traceback (most recent call last):
  ...
TypeError: Don't know how to wrap a <class 'generator'> object to a HyObject

Now suppose we had converted this to a tuple,

=> sym
sym

HyExpression([
  HySymbol('sum'),
  (1, 3, 5, 7, 9)])
=> (eval sym)
from hy.core.language import eval
eval(sym)

25

It works, but it probably shouldn't. Using cons shows why.

=> (cons (, 1 2 3) 1)
from hy.core.language import cons
cons((1, 2, 3), 1)

<HyCons (
  HyList([
    HyInteger(1),
    HyInteger(2),
    HyInteger(3)])
. HyInteger(1))>

The tuple isn't inserted as a tuple. It's getting autopromoted into a HyList. Hy is strange for a Lisp in that its model types are distinct from the data they represent. In Clojure there's no distinction between, say, HyInteger and int. Hy can autopromote certain datatypes (it didn't always #1314), but other types are not representable as Hy models. #919. There doesn't seem to be a way to insert arbitrary datatypes into the ast. Only Python code is allowed. I find the autoconversion of tuples to HyList a questionable choice to begin with.

@Kodiologist
Copy link
Member

You're basically objecting to #1314, then.

@Kodiologist
Copy link
Member

I mean, making the user promote everything by hand to the appropriate Hy model type would be pretty annoying. Right?

@gilch
Copy link
Member

gilch commented Mar 25, 2018

Not all of it maybe. The promotion of the atomic types is certainly convenient. But I'm less comfortable with the compound types. We can splice them in like [~@foo] when constructing macros, without referring to HyList and friends explicitly. This would be more consistent with the generators.

@gilch
Copy link
Member

gilch commented Mar 26, 2018

I mentioned in #1542 that `(0 ~@[1 2 3] 4) could compile to
HyExpression([HyInteger(0), *list([1, 2, 3] or []), HyInteger(4)]),
but it's probably more efficient to use tuples instead of lists,
HyExpression([HyInteger(0), *tuple([1, 2, 3] or ()), HyInteger(4)]).

There's also a hygene issue that I mentioned in #1407. Shadowing list with a local can break quasiquotes. It's probably safer to use something like __import__("builtins").tuple or __import__("builtins").list here. We have not been careful enough in the compiler. This does add noise to the output, which we were trying to clean up here. We need a better system for macro hygiene.
HyExpression([HyInteger(0), *__import__("builtins").tuple([1, 2, 3] or ()), HyInteger(4)])
Even if we don't do the __import__, tuple is less likely to cause an accidental capture than list.

if not PY35:
# Without Python 3.5's unpacking improvements, we need
# to concatenate the expression manually
contents = HyExpression([HySymbol("sum"), contents, HyList()])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring of sum says "This function is intended specifically for use with numeric values and may reject non-numeric types." It's an implementation detail that this works at all. Other Python implementations or future Python versions may break this and a type checker has every right to flag this kind of use as a type error for not using numbers. Hy's + special form is already variadic, why not use that instead? Note that the unary + calls __pos__ instead of __add__, but you can force it to use the binary form by starting with an empty list.

@Kodiologist
Copy link
Member

I'm closing this for lack of response.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Looks like replace does unnecessary work
3 participants