
Refactor COMPRESS-INSTRUCTIONS #830

Merged

Conversation

@markasoftware (Contributor) commented Jun 30, 2022

Passes all tests, and produces output just as good as the original implementation on some basic benchmarks I ran.

(commits should probably be squashed)

Mark Polyakov added 4 commits June 27, 2022 13:51
I believe it's sound, but it does not always optimize as well -- it does not yet implement some of the
repeated passes that the old COMPRESS-INSTRUCTIONS did.
I believe the rewrite is more or less up to the same performance level as the original (passes
tests, but manual testing in REPL indicates its sequences are often a few gates longer). However,
there is some bug wrt state prep that is causing some tests still to fail.
It's not 100% functionally equivalent to the original, so I still need to run some manual tests to
evaluate its performance.
In the last commit, I flushed all existing queues. I had misread the original implementation, which in fact
only flushed the sub-queue that was in "passing" mode, i.e. the queue involving the greatest
number of qubits.

As of this commit, the output of the new COMPRESS-INSTRUCTIONS seems just as good as the original
implementation.
@markasoftware (Contributor, Author):

While the output is just as short as the output from the original COMPRESS-INSTRUCTIONS, my implementation is not exactly functionally identical. The main difference is that the original requires all queues to correspond to actual hardware objects, and has one "global" queue which does not correspond to a hardware object. My implementation allows each queue to use an arbitrary set of resources, not necessarily corresponding to a real hardware object, and does not have a concept of a "global" queue. This means my implementation can have multiple 3-qubit queues simultaneously, for example.
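The distinction between the two queue models can be sketched as follows. This is a hypothetical Python model, not the actual quilc Lisp code; the helper names (`original_queue_for`, `new_queue_for`) and the example qubit sets are invented for illustration:

```python
# Hypothetical sketch of the two queue-selection strategies; not the actual
# quilc implementation. Names and qubit sets are invented for illustration.

# Original scheme: one queue per hardware object, plus a single "global" queue
# that does not correspond to any hardware object.
hardware_objects = [frozenset({0}), frozenset({1}), frozenset({0, 1})]
original_queues = {obj: [] for obj in hardware_objects}
original_queues["global"] = []  # catch-all for instructions on wider resource sets

def original_queue_for(qubits):
    """Route an instruction to its hardware object's queue, else the global queue."""
    return original_queues.get(frozenset(qubits), original_queues["global"])

# New scheme: queues are keyed by arbitrary qubit sets and created on demand,
# so e.g. two distinct 3-qubit queues can coexist at the same time.
new_queues = {}

def new_queue_for(qubits):
    return new_queues.setdefault(frozenset(qubits), [])

new_queue_for([0, 1, 2]).append("CCNOT 0 1 2")
new_queue_for([3, 4, 5]).append("CCNOT 3 4 5")
# Both 3-qubit queues now exist simultaneously, which the original scheme
# could not represent without routing one of them to the global queue.
```

In the original scheme, any instruction whose resource set is not a hardware object falls into the single global queue; in the on-demand scheme it simply gets its own queue.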

If these details of the original implementation are important for some reason (maybe speed in certain situations?), adding them to my branch will be easy.

@braised-babbage (Contributor) commented Jun 30, 2022

This is well written code, and notably shorter than what it replaces, so to the extent that it behaves similarly (as a functional component of quilc) to what we had before, I'm happy to see it merged.

However, I am a bit hesitant to encourage piecemeal changes at this stage, in the sense that I don't think we have an entirely clear perspective on the compressor, i.e. what we want from it, what we expect, what sort of compromises we are making, and so forth. I have some limited perspective, of course: I know that we want peephole optimization to clean up after the translation to native code, I know that we additionally support rather generic "decompilation" for certain arity (e.g. take a 1Q sequence to ZXZ), and so forth. But to the extent that the compressor represents a kind of heuristic solution to some well posed but intractable problem, it might be valuable to articulate that problem a bit more explicitly. In other words, I think this PR looks good, but (to me) it's most exciting when it is in the context of a larger renewal of thinking on how quilc does its job.

Along similar lines, I wonder about the extent to which we can or should think of compression as simply being a one-size-fits-all mechanism "parameterized" by the chip specification and defined compilers. As you note, the original implementation makes a conscious decision to associate queues to hardware objects. I don't take this by itself as a good or bad decision, but I do take it as indicative that the original design envisioned a certain relationship between our understanding of "compression" and the device that we are compiling to.

One can imagine quite disparate architectures:

  • the traditionally supported CZ/CNOT + 1Q rotations
  • ion trap devices with native high-arity gates
  • exchange-only quantum dots
  • "discrete" gate sets (e.g. the Clifford + T work that has recently gone into quilc)

I suspect (but might be wrong) that most of the comparisons of this PR against the old compressor are with respect to the first item in that list, because that is mostly what we have tested & benchmarked quilc on. It is in the spirit of quilc that many of the differences between the aforementioned architectures can come down simply to the available compilers on the chip and how they are exposed via hardware objects. On the other hand, it is not obvious to me that compressing well is quite as simple as that.

@markasoftware (Contributor, Author):

The purpose of this PR is not to change the paradigm of compression in quilc from chip-specific to generic. In fact, my goal is not to change the behavior of COMPRESS-INSTRUCTIONS at all. I only changed the behavior when I believed the original behavior to be more a quirk of the implementation rather than an intentional, well-thought-out choice. (The old implementation creates all the state machines/queues that it needs right at the start. If your implementation is going to precreate all its queues, then hardware objects are a reasonable way to pick those queues. But in an implementation like my own where the queues are created on-demand, there is no reason to limit yourself to hardware objects afaik).

A more philosophical discussion about the purpose of the compressor is certainly a good idea before I start really messing with it in later PRs.

@braised-babbage (Contributor):

> The purpose of this PR is not to change the paradigm of compression in quilc from chip-specific to generic. In fact, my goal is not to change the behavior of COMPRESS-INSTRUCTIONS at all. I only changed the behavior when I believed the original behavior to be more a quirk of the implementation rather than an intentional, well-thought-out choice. (The old implementation creates all the state machines/queues that it needs right at the start. If your implementation is going to precreate all its queues, then hardware objects are a reasonable way to pick those queues. But in an implementation like my own where the queues are created on-demand, there is no reason to limit yourself to hardware objects afaik).
>
> A more philosophical discussion about the purpose of the compressor is certainly a good idea before I start really messing with it in later PRs.

That's very fair, and I didn't intend to impute any additional meaning to these changes than what you had written in your earlier comment. The second half of my remark was more just me running down a tangent. Regardless, I think this PR is good!

Review thread on src/compressor/compressor.lisp (outdated; resolved)
@ecpeterson (Contributor) previously approved these changes Jul 3, 2022 and left a comment:


I’m glad to see this. I agree that the compressor was not designed by pure principle, particularly around the “global” queue, and I think simplifying it is a good idea. Even if it were to underperform the old implementation, the new simplicity is appealing enough to me that I’d rather build atop it than hold on to the old one.

I also agree with Erik that the coming moments are a good opportunity to try to capture what the principle of a compressor should be and to pin further modifications against that. For that matter, while we’re all here, I’d like to draw everyone’s attention back to #486. Our compressor does some things that their rewriter cannot, and vice versa, which should lead both to an interesting integration problem and to fruitful food for thought about the definition of a compressor.

My only PR comment is superficial: there's an if in flush-queue that I wish were a cond.

@markasoftware (Contributor, Author):

Thanks for the feedback @ecpeterson.

I've been looking a little bit more into the speed of my implementation. Without a global queue, it does tend to create a greater number of queues than the original implementation, and is typically about 10% slower on circuits involving more than 2 qubits. I'd assume the extra flexibility could lead to better compression in some cases, so this isn't a problem IMO.

However, when increasing *global-queue-tolerance-threshold* to be greater than 3, the speed gets worse. On some circuits I tried, the performance was 50% worse than the original when *global-queue-tolerance-threshold* is set to 4. That's because if we start with size 4 queues, then my implementation creates size 3 queues next, then size 2, and finally size 1. On the other hand, the original implementation would create a size 4 global queue, but then immediately drop down to size 2 queues, then size 1.

I could use some guidance on whether there's any reason to compress with an "intermediate"-sized queue (i.e., a queue whose size is strictly between 2 and *global-queue-tolerance-threshold*). For example, I'm skeptical that compression on a size-3 queue will make additional progress on a sequence of instructions that was already compressed as part of a size-4 queue, since no new compilers are introduced until we go down to size 2.
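The two recursion schedules over queue sizes can be illustrated with a small sketch. This is hypothetical Python, not the actual quilc code; `stepwise_schedule` and `jumping_schedule` are invented helper names, and the threshold value mirrors *global-queue-tolerance-threshold*:

```python
# Hypothetical sketch of the two recursion schedules over queue sizes; not the
# actual quilc implementation. Helper names are invented for illustration.

def stepwise_schedule(start):
    """New implementation's behavior: visit every intermediate queue size."""
    return list(range(start, 0, -1))

def jumping_schedule(start):
    """Original behavior: after the widest queue, drop straight to size 2."""
    return [start, 2, 1] if start > 2 else list(range(start, 0, -1))

# With a threshold of 4, the stepwise schedule compresses at sizes 4, 3, 2, 1,
# even though no new compilers become available between sizes 4 and 2; the
# jumping schedule skips the size-3 pass entirely.
print(stepwise_schedule(4))  # [4, 3, 2, 1]
print(jumping_schedule(4))   # [4, 2, 1]
```

The extra size-3 pass in the stepwise schedule is the source of the slowdown described above when the threshold exceeds 3.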

In the examples I've been playing around with, my implementation doesn't compress noticeably better than the original. But maybe there are some situations where it could.

It's very easy for me to replicate the original behavior here if desired and jump straight down to size 2 queues during recursion. In fact, I have already been playing around with this, and it performs almost identically to the original in terms of both output quality and speed.

@markasoftware (Contributor, Author):

I just pushed a change to restrict the recursive queue sizes to 2 as discussed in my previous comment.

IMO the PR is now ready to merge.

@ecpeterson (Contributor) commented Sep 1, 2022

Sorry for forgetting to reply to your query! I'm not surprised about the speed degradation for "wide" queues, but I also don't think it's worth fretting about. You can jump to 2 or not bother. I think it's exceedingly rare that traversing all the intermediate queue sizes will yield an improvement — even if quilc were capable of optimal nQ compilation, which it is not. On the other hand, jumping to 2 is kind of a weird quirk that may spook some future maintainer. 🤷‍♀️

Merge it either way imo.

@stylewarning (Member):

> Sorry for forgetting to reply to your query! I'm not surprised about the speed degradation for "wide" queues, but I also don't think it's worth fretting about. You can jump to 2 or not bother. I think it's exceedingly rare that traversing all the intermediate queue sizes will yield an improvement — even if quilc were capable of optimal nQ compilation, which it is not. On the other hand, jumping to 2 is kind of a weird quirk that may spook some future maintainer. 🤷‍♀️
>
> Merge it either way imo.

I'm going to merge this, but @markasoftware could you add a comment to the source code with some of these sentiments so as to not spook future maintainers, and make a separate PR?

@stylewarning stylewarning merged commit c3c142f into quil-lang:master Sep 7, 2022