Fixed bug for 16a4w ptq #12167

rohansjoshi · 2025-07-02T17:32:52Z

Summary: Currently, running the script executorch/examples/models/llama/export_llama.py with the flag --ptq 16a4w performs 16a16w quantization instead of 16a4w; this diff fixes that. This may be related to some GitHub issues.

Differential Revision: D77671468
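
For readers unfamiliar with the naming, a value such as 16a4w encodes 16-bit activations and 4-bit weights, so the bug described above amounts to the weight width being treated as 16 instead of 4. The sketch below is illustrative only: `PtqSpec` and `parse_ptq_flag` are hypothetical names that do not come from ExecuTorch, and this is not the code changed by this diff. It only shows how such a flag value can be decoded and checked.

```python
import re
from typing import NamedTuple


class PtqSpec(NamedTuple):
    """Hypothetical container for the bit widths encoded in a --ptq value."""

    activation_bits: int
    weight_bits: int


def parse_ptq_flag(flag: str) -> PtqSpec:
    """Decode a string such as "16a4w" into activation and weight bit widths.

    Illustrative helper only; it is not part of ExecuTorch.
    """
    match = re.fullmatch(r"(\d+)a(\d+)w", flag)
    if match is None:
        raise ValueError(f"unrecognized ptq value: {flag!r}")
    return PtqSpec(int(match.group(1)), int(match.group(2)))


spec = parse_ptq_flag("16a4w")
# A regression check in this spirit would catch the weights silently
# falling back to 16 bits.
assert spec.weight_bits == 4
```

A check of this kind, applied to whatever quantizer configuration the export script actually builds, is one way to guard against the flag and the resulting quantization scheme drifting apart.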

pytorch-bot · 2025-07-02T17:32:59Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/12167

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 Cancelled Jobs

As of commit 338770d with merge base 967cfae:

CANCELLED JOBS - The following jobs were cancelled. Please retry:

pull / test-models-linux (mobilebert, portable, linux.2xlarge) / linux-job (gh)
##[error]The operation was canceled.
pull / test-models-linux (mobilebert, xnnpack-quantization-delegation, linux.2xlarge) / linux-job (gh)
##[error]The operation was canceled.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot · 2025-07-02T17:33:02Z

This pull request was exported from Phabricator. Differential Revision: D77671468

github-actions · 2025-07-02T17:33:38Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Fixed bug for 16a4w ptq

338770d

rohansjoshi requested review from jackzhxng, larryliu0820, swolchok and mergennachin as code owners July 2, 2025 17:32

facebook-github-bot added the CLA Signed label Jul 2, 2025

facebook-github-bot added the fb-exported label Jul 2, 2025
