Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow configuring Pandoc #7529

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

Allow configuring Pandoc #7529

wants to merge 2 commits into from

Conversation

asankah
Copy link

@asankah asankah commented Aug 1, 2020

Add Pandoc configuration options for those situations where --mathjax alone doesn't cut it.

@CLAassistant
Copy link

CLAassistant commented Aug 1, 2020

CLA assistant check
All committers have signed the CLA.

@asankah
Copy link
Author

asankah commented Aug 1, 2020

Needs some work, but I'd like to get an early review before I spend more time on this.

bibliography := c.cfg.MarkupConfig.Bibliography

if bibliography.Source != "" {
arguments = append(arguments, "--bibliography", bibliography.Source)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the --bibliography flag also needed the --citeproc flag to work?

@AustenLamacraft
Copy link

Since pandoc now has citeproc built in, this is the only thing standing in the way of having citations in Hugo, which would be huge for many academic users.

@igor-krawczuk
Copy link

Has this been abandoned by now or will it be merged any time soon and I should I hold of of doing my own patch of hugo to enable mathml?

@MilkClouds
Copy link

MilkClouds commented Apr 27, 2022

I'm anticipating for this feature. What is going on in this pull request? Will it be merged or there's something problem? If possible, I have intention to contribute to this feature, if there's something problematic.

@asankah
Copy link
Author

asankah commented Apr 27, 2022

@MilkClouds I didn't get any feedback on this PR one way or another. I can clean up my local patch and update the PR, but it would be it would be nice to get some signal on whether we are going down the right path.

@bep
Copy link
Member

bep commented Apr 27, 2022

@asankah yea, sorry about this, I'm not using Pandoc so it has not been on top of my priority list. Earlier, I was more concerned about the security aspects of adding more config options to these external programs (e.g. pandic, asciidoc). Now that we have added a security config that blocks pandoc by default, I'm more open to these additions.

Can I ask a general question before I look through these changes in detail: Do any of these options allow passing arbitrary flags directly to the pandoc binary?

@asankah
Copy link
Author

asankah commented Apr 27, 2022

@bep Currently it does allow passing arbitrary flags, but in practice I've found it to not be necessary. So I'll remove those.

pandoc users tend to go heavy on custom filters which obviously run under the users' identity and can pretty much do anything. So the allowing custom filters tends to make the security properties fuzzy.

Would it be possible to point me at the threat model we are looking at for Hugo? A rough sketch would be enough.

@bep
Copy link
Member

bep commented Apr 27, 2022

@asankah I have written some prose about this here: https://gohugo.io/about/security-model/

The big problem with shelling out to pandoc and asciidoc etc. is that they from a security standpoint becomes black boxes where it not practical have any opinion on the security aspects beyond using the known features as described by the flags that we pass to it.

My main concern about this would be some phishing attacks via a malicious sites (I sometimes help people by downloading sites and building them on my computer) or modules. The pandoc config would only be available to the site config, which is a good thing.

Since pandoc is now blocked by default, I'm sleeping easier at night, but I'm still a little wary.

And my main concern is these unknown flags (all flags of pandoc is unknown to me), e.g. pandoc --compile-and-run-c-program.

If the "arbitrary flag" thing is not vital, I suggest removing it, else we can discuss it.

@MilkClouds
Copy link

@bep Since I do not well know about security, I can't speak with confidence, but I guess there isn't any concerns about merging this PR. Even now I can allow usage of pandoc by appending pandoc to security.allow.exec, like this.

security:
  enableInlineShortcodes: false
  exec:
    allow: ["^dart-sass-embedded$", "^go$", "^npx$", "^postcss$", "^pandoc"]
    osEnv: ["(?i)^(PATH|PATHEXT|APPDATA|TMP|TEMP|TERM)$"]

I think letting anyone who want to use pandoc with arbitrary flag to manually config security config, and alerting them is enough. Is there any specific scenario that arbitary argument may be threating to site? Is it related to malicious theme maker?

@bep
Copy link
Member

bep commented Apr 28, 2022

Is there any specific scenario that arbitary argument may be threating to site? I

This:

pandoc --compile-and-run-c-program

As I said above, I assume pandoc does not have such flag, but then again, I assumed that about log4j as well.

@igor-krawczuk
Copy link

May I suggest making a whitelist for additional flags under security, like currently necessary for pandoc anyway? Or ist the concern a user can be tricked to clone a malicious git repo with a malicious config (including whitelists), run Hugo and so get hacked?

@MilkClouds
Copy link

Then, will it be okay if parameter of pandoc is being registed in environment variables?
malicious theme or config cann't modify our environment variable, as long as I know.

shoeffner added a commit to shoeffner/hugo that referenced this pull request May 30, 2022
Adds the citeproc filter to the pandoc converter.

There are several PRs for it this feature already. However, I think
simply adding `--citeproc` is the cleanest way to enable this feature,
with the option to flesh it out later, e.g., in gohugoio#7529.

Some PRs and issues attempt adding more config options to Hugo which
indirectly configure pandoc, but I think simply configuring Pandoc via
Pandoc itself is simpler, as it is already possible with two YAML
blocks -- one for Hugo, and one for Pandoc:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    ...
    Document content with @citation!

There are other useful options, e.g., gohugoio#4800 attempts to use `nocite`,
which works out of the box with this PR:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    nocite: |
      @*
    ...
    Document content with no citations but a full bibliography:

    ## Bibliography

Other useful options are `csl: ...` and `link-citations: true`, which
set the path to a custom CSL file and create HTML links between the
references and the bibliography.

The following issues and PRs are related:

- Add support for parsing citations and Jupyter notebooks via Pandoc and/or Goldmark extension gohugoio#6101
  Bundles multiple requests, this PR tackles citation parsing.

- WIP: Bibliography with Pandoc gohugoio#4800
  Passes the frontmatter to Pandoc and still uses
  `--filter pandoc-citeproc` instead of `--citeproc`.
- Allow configuring Pandoc gohugoio#7529
  That PR is much more extensive and might eventually supersede this PR,
  but I think --bibliography and --citeproc should be independent
  options (--bibliography should be optional and citeproc can always be
  specified).
- Pandoc - allow citeproc extension to be invoked, with bibliography. gohugoio#8610
  Similar to gohugoio#7529, gohugoio#8610 adds a new config option to Hugo.
  I think passing --citeproc and letting the users decide on the
  metadata they want to pass to pandoc is better, albeit uglier.
shoeffner added a commit to shoeffner/hugo that referenced this pull request May 30, 2022
Adds the citeproc filter to the pandoc converter.

There are several PRs for it this feature already. However, I think
simply adding `--citeproc` is the cleanest way to enable this feature,
with the option to flesh it out later, e.g., in gohugoio#7529.

Some PRs and issues attempt adding more config options to Hugo which
indirectly configure pandoc, but I think simply configuring Pandoc via
Pandoc itself is simpler, as it is already possible with two YAML
blocks -- one for Hugo, and one for Pandoc:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    ...
    Document content with @citation!

There are other useful options, e.g., gohugoio#4800 attempts to use `nocite`,
which works out of the box with this PR:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    nocite: |
      @*
    ...
    Document content with no citations but a full bibliography:

    ## Bibliography

Other useful options are `csl: ...` and `link-citations: true`, which
set the path to a custom CSL file and create HTML links between the
references and the bibliography.

The following issues and PRs are related:

- Add support for parsing citations and Jupyter notebooks via Pandoc and/or Goldmark extension gohugoio#6101
  Bundles multiple requests, this PR tackles citation parsing.

- WIP: Bibliography with Pandoc gohugoio#4800
  Passes the frontmatter to Pandoc and still uses
  `--filter pandoc-citeproc` instead of `--citeproc`.
- Allow configuring Pandoc gohugoio#7529
  That PR is much more extensive and might eventually supersede this PR,
  but I think --bibliography and --citeproc should be independent
  options (--bibliography should be optional and citeproc can always be
  specified).
- Pandoc - allow citeproc extension to be invoked, with bibliography. gohugoio#8610
  Similar to gohugoio#7529, gohugoio#8610 adds a new config option to Hugo.
  I think passing --citeproc and letting the users decide on the
  metadata they want to pass to pandoc is better, albeit uglier.
shoeffner added a commit to shoeffner/hugo that referenced this pull request May 30, 2022
Adds the citeproc filter to the pandoc converter.

There are several PRs for it this feature already. However, I think
simply adding `--citeproc` is the cleanest way to enable this feature,
with the option to flesh it out later, e.g., in gohugoio#7529.

Some PRs and issues attempt adding more config options to Hugo which
indirectly configure pandoc, but I think simply configuring Pandoc via
Pandoc itself is simpler, as it is already possible with two YAML
blocks -- one for Hugo, and one for Pandoc:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    ...
    Document content with @citation!

There are other useful options, e.g., gohugoio#4800 attempts to use `nocite`,
which works out of the box with this PR:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    nocite: |
      @*
    ...
    Document content with no citations but a full bibliography:

    ## Bibliography

Other useful options are `csl: ...` and `link-citations: true`, which
set the path to a custom CSL file and create HTML links between the
references and the bibliography.

The following issues and PRs are related:

- Add support for parsing citations and Jupyter notebooks via Pandoc and/or Goldmark extension gohugoio#6101
  Bundles multiple requests, this PR tackles citation parsing.

- WIP: Bibliography with Pandoc gohugoio#4800
  Passes the frontmatter to Pandoc and still uses
  `--filter pandoc-citeproc` instead of `--citeproc`.
- Allow configuring Pandoc gohugoio#7529
  That PR is much more extensive and might eventually supersede this PR,
  but I think --bibliography and --citeproc should be independent
  options (--bibliography should be optional and citeproc can always be
  specified).
- Pandoc - allow citeproc extension to be invoked, with bibliography. gohugoio#8610
  Similar to gohugoio#7529, gohugoio#8610 adds a new config option to Hugo.
  I think passing --citeproc and letting the users decide on the
  metadata they want to pass to pandoc is better, albeit uglier.
shoeffner added a commit to shoeffner/hugo that referenced this pull request May 30, 2022
Adds the citeproc filter to the pandoc converter.

There are several PRs for it this feature already. However, I think
simply adding `--citeproc` is the cleanest way to enable this feature,
with the option to flesh it out later, e.g., in gohugoio#7529.

Some PRs and issues attempt adding more config options to Hugo which
indirectly configure pandoc, but I think simply configuring Pandoc via
Pandoc itself is simpler, as it is already possible with two YAML
blocks -- one for Hugo, and one for Pandoc:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    ...
    Document content with @citation!

There are other useful options, e.g., gohugoio#4800 attempts to use `nocite`,
which works out of the box with this PR:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    nocite: |
      @*
    ...
    Document content with no citations but a full bibliography:

    ## Bibliography

Other useful options are `csl: ...` and `link-citations: true`, which
set the path to a custom CSL file and create HTML links between the
references and the bibliography.

The following issues and PRs are related:

- Add support for parsing citations and Jupyter notebooks via Pandoc and/or Goldmark extension gohugoio#6101
  Bundles multiple requests, this PR tackles citation parsing.

- WIP: Bibliography with Pandoc gohugoio#4800
  Passes the frontmatter to Pandoc and still uses
  `--filter pandoc-citeproc` instead of `--citeproc`.
- Allow configuring Pandoc gohugoio#7529
  That PR is much more extensive and might eventually supersede this PR,
  but I think --bibliography and --citeproc should be independent
  options (--bibliography should be optional and citeproc can always be
  specified).
- Pandoc - allow citeproc extension to be invoked, with bibliography. gohugoio#8610
  Similar to gohugoio#7529, gohugoio#8610 adds a new config option to Hugo.
  I think passing --citeproc and letting the users decide on the
  metadata they want to pass to pandoc is better, albeit uglier.
shoeffner added a commit to shoeffner/hugo that referenced this pull request Jun 4, 2022
Adds the citeproc filter to the pandoc converter.

There are several PRs for it this feature already. However, I think
simply adding `--citeproc` is the cleanest way to enable this feature,
with the option to flesh it out later, e.g., in gohugoio#7529.

Some PRs and issues attempt adding more config options to Hugo which
indirectly configure pandoc, but I think simply configuring Pandoc via
Pandoc itself is simpler, as it is already possible with two YAML
blocks -- one for Hugo, and one for Pandoc:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    ...
    Document content with @citation!

There are other useful options, e.g., gohugoio#4800 attempts to use `nocite`,
which works out of the box with this PR:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    nocite: |
      @*
    ...
    Document content with no citations but a full bibliography:

    ## Bibliography

Other useful options are `csl: ...` and `link-citations: true`, which
set the path to a custom CSL file and create HTML links between the
references and the bibliography.

The following issues and PRs are related:

- Add support for parsing citations and Jupyter notebooks via Pandoc and/or Goldmark extension gohugoio#6101
  Bundles multiple requests, this PR tackles citation parsing.

- WIP: Bibliography with Pandoc gohugoio#4800
  Passes the frontmatter to Pandoc and still uses
  `--filter pandoc-citeproc` instead of `--citeproc`.
- Allow configuring Pandoc gohugoio#7529
  That PR is much more extensive and might eventually supersede this PR,
  but I think --bibliography and --citeproc should be independent
  options (--bibliography should be optional and citeproc can always be
  specified).
- Pandoc - allow citeproc extension to be invoked, with bibliography. gohugoio#8610
  Similar to gohugoio#7529, gohugoio#8610 adds a new config option to Hugo.
  I think passing --citeproc and letting the users decide on the
  metadata they want to pass to pandoc is better, albeit uglier.
shoeffner added a commit to shoeffner/hugo that referenced this pull request Jun 4, 2022
Adds the citeproc filter to the pandoc converter if pandoc >= 2.11 is
available.

There are several PRs for it this feature already. However, I think
simply adding `--citeproc` is the cleanest way to enable this feature,
with the option to flesh it out later, e.g., in gohugoio#7529.

Some PRs and issues attempt adding more config options to Hugo which
indirectly configure pandoc, but I think simply configuring Pandoc via
Pandoc itself is simpler, as it is already possible with two YAML
blocks -- one for Hugo, and one for Pandoc:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    ...
    Document content with @citation!

There are other useful options, e.g., gohugoio#4800 attempts to use `nocite`,
which works out of the box with this PR:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    nocite: |
      @*
    ...
    Document content with no citations but a full bibliography:

    ## Bibliography

Other useful options are `csl: ...` and `link-citations: true`, which
set the path to a custom CSL file and create HTML links between the
references and the bibliography.

The following issues and PRs are related:

- Add support for parsing citations and Jupyter notebooks via Pandoc and/or Goldmark extension gohugoio#6101
  Bundles multiple requests, this PR tackles citation parsing.

- WIP: Bibliography with Pandoc gohugoio#4800
  Passes the frontmatter to Pandoc and still uses
  `--filter pandoc-citeproc` instead of `--citeproc`.
- Allow configuring Pandoc gohugoio#7529
  That PR is much more extensive and might eventually supersede this PR,
  but I think --bibliography and --citeproc should be independent
  options (--bibliography should be optional and citeproc can always be
  specified).
- Pandoc - allow citeproc extension to be invoked, with bibliography. gohugoio#8610
  Similar to gohugoio#7529, gohugoio#8610 adds a new config option to Hugo.
  I think passing --citeproc and letting the users decide on the
  metadata they want to pass to pandoc is better, albeit uglier.
shoeffner added a commit to shoeffner/hugo that referenced this pull request Jun 4, 2022
Adds the citeproc filter to the pandoc converter.

There are several PRs for it this feature already. However, I think
simply adding `--citeproc` is the cleanest way to enable this feature,
with the option to flesh it out later, e.g., in gohugoio#7529.

Some PRs and issues attempt adding more config options to Hugo which
indirectly configure pandoc, but I think simply configuring Pandoc via
Pandoc itself is simpler, as it is already possible with two YAML
blocks -- one for Hugo, and one for Pandoc:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    ...
    Document content with @citation!

There are other useful options, e.g., gohugoio#4800 attempts to use `nocite`,
which works out of the box with this PR:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    nocite: |
      @*
    ...
    Document content with no citations but a full bibliography:

    ## Bibliography

Other useful options are `csl: ...` and `link-citations: true`, which
set the path to a custom CSL file and create HTML links between the
references and the bibliography.

The following issues and PRs are related:

- Add support for parsing citations and Jupyter notebooks via Pandoc and/or Goldmark extension gohugoio#6101
  Bundles multiple requests, this PR tackles citation parsing.

- WIP: Bibliography with Pandoc gohugoio#4800
  Passes the frontmatter to Pandoc and still uses
  `--filter pandoc-citeproc` instead of `--citeproc`.
- Allow configuring Pandoc gohugoio#7529
  That PR is much more extensive and might eventually supersede this PR,
  but I think --bibliography and --citeproc should be independent
  options (--bibliography should be optional and citeproc can always be
  specified).
- Pandoc - allow citeproc extension to be invoked, with bibliography. gohugoio#8610
  Similar to gohugoio#7529, gohugoio#8610 adds a new config option to Hugo.
  I think passing --citeproc and letting the users decide on the
  metadata they want to pass to pandoc is better, albeit uglier.
shoeffner added a commit to shoeffner/hugo that referenced this pull request Jun 5, 2022
Adds the citeproc filter to the pandoc converter.

There are several PRs for it this feature already. However, I think
simply adding `--citeproc` is the cleanest way to enable this feature,
with the option to flesh it out later, e.g., in gohugoio#7529.

Some PRs and issues attempt adding more config options to Hugo which
indirectly configure pandoc, but I think simply configuring Pandoc via
Pandoc itself is simpler, as it is already possible with two YAML
blocks -- one for Hugo, and one for Pandoc:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    ...
    Document content with @citation!

There are other useful options, e.g., gohugoio#4800 attempts to use `nocite`,
which works out of the box with this PR:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    nocite: |
      @*
    ...
    Document content with no citations but a full bibliography:

    ## Bibliography

Other useful options are `csl: ...` and `link-citations: true`, which
set the path to a custom CSL file and create HTML links between the
references and the bibliography.

The following issues and PRs are related:

- Add support for parsing citations and Jupyter notebooks via Pandoc and/or Goldmark extension gohugoio#6101
  Bundles multiple requests, this PR tackles citation parsing.

- WIP: Bibliography with Pandoc gohugoio#4800
  Passes the frontmatter to Pandoc and still uses
  `--filter pandoc-citeproc` instead of `--citeproc`.
- Allow configuring Pandoc gohugoio#7529
  That PR is much more extensive and might eventually supersede this PR,
  but I think --bibliography and --citeproc should be independent
  options (--bibliography should be optional and citeproc can always be
  specified).
- Pandoc - allow citeproc extension to be invoked, with bibliography. gohugoio#8610
  Similar to gohugoio#7529, gohugoio#8610 adds a new config option to Hugo.
  I think passing --citeproc and letting the users decide on the
  metadata they want to pass to pandoc is better, albeit uglier.
shoeffner added a commit to shoeffner/hugo that referenced this pull request Jun 5, 2022
Adds the citeproc filter to the pandoc converter.

There are several PRs for it this feature already. However, I think
simply adding `--citeproc` is the cleanest way to enable this feature,
with the option to flesh it out later, e.g., in gohugoio#7529.

Some PRs and issues attempt adding more config options to Hugo which
indirectly configure pandoc, but I think simply configuring Pandoc via
Pandoc itself is simpler, as it is already possible with two YAML
blocks -- one for Hugo, and one for Pandoc:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    ...
    Document content with @citation!

There are other useful options, e.g., gohugoio#4800 attempts to use `nocite`,
which works out of the box with this PR:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    nocite: |
      @*
    ...
    Document content with no citations but a full bibliography:

    ## Bibliography

Other useful options are `csl: ...` and `link-citations: true`, which
set the path to a custom CSL file and create HTML links between the
references and the bibliography.

The following issues and PRs are related:

- Add support for parsing citations and Jupyter notebooks via Pandoc and/or Goldmark extension gohugoio#6101
  Bundles multiple requests, this PR tackles citation parsing.

- WIP: Bibliography with Pandoc gohugoio#4800
  Passes the frontmatter to Pandoc and still uses
  `--filter pandoc-citeproc` instead of `--citeproc`.
- Allow configuring Pandoc gohugoio#7529
  That PR is much more extensive and might eventually supersede this PR,
  but I think --bibliography and --citeproc should be independent
  options (--bibliography should be optional and citeproc can always be
  specified).
- Pandoc - allow citeproc extension to be invoked, with bibliography. gohugoio#8610
  Similar to gohugoio#7529, gohugoio#8610 adds a new config option to Hugo.
  I think passing --citeproc and letting the users decide on the
  metadata they want to pass to pandoc is better, albeit uglier.
shoeffner added a commit to shoeffner/hugo that referenced this pull request Jun 5, 2022
Adds the citeproc filter to the pandoc converter.

There are several PRs for it this feature already. However, I think
simply adding `--citeproc` is the cleanest way to enable this feature,
with the option to flesh it out later, e.g., in gohugoio#7529.

Some PRs and issues attempt adding more config options to Hugo which
indirectly configure pandoc, but I think simply configuring Pandoc via
Pandoc itself is simpler, as it is already possible with two YAML
blocks -- one for Hugo, and one for Pandoc:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    ...
    Document content with @citation!

There are other useful options, e.g., gohugoio#4800 attempts to use `nocite`,
which works out of the box with this PR:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    nocite: |
      @*
    ...
    Document content with no citations but a full bibliography:

    ## Bibliography

Other useful options are `csl: ...` and `link-citations: true`, which
set the path to a custom CSL file and create HTML links between the
references and the bibliography.

The following issues and PRs are related:

- Add support for parsing citations and Jupyter notebooks via Pandoc and/or Goldmark extension gohugoio#6101
  Bundles multiple requests, this PR tackles citation parsing.

- WIP: Bibliography with Pandoc gohugoio#4800
  Passes the frontmatter to Pandoc and still uses
  `--filter pandoc-citeproc` instead of `--citeproc`.
- Allow configuring Pandoc gohugoio#7529
  That PR is much more extensive and might eventually supersede this PR,
  but I think --bibliography and --citeproc should be independent
  options (--bibliography should be optional and citeproc can always be
  specified).
- Pandoc - allow citeproc extension to be invoked, with bibliography. gohugoio#8610
  Similar to gohugoio#7529, gohugoio#8610 adds a new config option to Hugo.
  I think passing --citeproc and letting the users decide on the
  metadata they want to pass to pandoc is better, albeit uglier.
shoeffner added a commit to shoeffner/hugo that referenced this pull request Jun 5, 2022
Adds the citeproc filter to the pandoc converter.

There are several PRs for it this feature already. However, I think
simply adding `--citeproc` is the cleanest way to enable this feature,
with the option to flesh it out later, e.g., in gohugoio#7529.

Some PRs and issues attempt adding more config options to Hugo which
indirectly configure pandoc, but I think simply configuring Pandoc via
Pandoc itself is simpler, as it is already possible with two YAML
blocks -- one for Hugo, and one for Pandoc:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    ...
    Document content with @citation!

There are other useful options, e.g., gohugoio#4800 attempts to use `nocite`,
which works out of the box with this PR:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    nocite: |
      @*
    ...
    Document content with no citations but a full bibliography:

    ## Bibliography

Other useful options are `csl: ...` and `link-citations: true`, which
set the path to a custom CSL file and create HTML links between the
references and the bibliography.

The following issues and PRs are related:

- Add support for parsing citations and Jupyter notebooks via Pandoc and/or Goldmark extension gohugoio#6101
  Bundles multiple requests, this PR tackles citation parsing.

- WIP: Bibliography with Pandoc gohugoio#4800
  Passes the frontmatter to Pandoc and still uses
  `--filter pandoc-citeproc` instead of `--citeproc`.
- Allow configuring Pandoc gohugoio#7529
  That PR is much more extensive and might eventually supersede this PR,
  but I think --bibliography and --citeproc should be independent
  options (--bibliography should be optional and citeproc can always be
  specified).
- Pandoc - allow citeproc extension to be invoked, with bibliography. gohugoio#8610
  Similar to gohugoio#7529, gohugoio#8610 adds a new config option to Hugo.
  I think passing --citeproc and letting the users decide on the
  metadata they want to pass to pandoc is better, albeit uglier.
shoeffner added a commit to shoeffner/hugo that referenced this pull request Jun 20, 2022
Adds the citeproc filter to the pandoc converter.

There are several PRs for it this feature already. However, I think
simply adding `--citeproc` is the cleanest way to enable this feature,
with the option to flesh it out later, e.g., in gohugoio#7529.

Some PRs and issues attempt adding more config options to Hugo which
indirectly configure pandoc, but I think simply configuring Pandoc via
Pandoc itself is simpler, as it is already possible with two YAML
blocks -- one for Hugo, and one for Pandoc:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    ...
    Document content with @citation!

There are other useful options, e.g., gohugoio#4800 attempts to use `nocite`,
which works out of the box with this PR:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    nocite: |
      @*
    ...
    Document content with no citations but a full bibliography:

    ## Bibliography

Other useful options are `csl: ...` and `link-citations: true`, which
set the path to a custom CSL file and create HTML links between the
references and the bibliography.

The following issues and PRs are related:

- Add support for parsing citations and Jupyter notebooks via Pandoc and/or Goldmark extension gohugoio#6101
  Bundles multiple requests, this PR tackles citation parsing.

- WIP: Bibliography with Pandoc gohugoio#4800
  Passes the frontmatter to Pandoc and still uses
  `--filter pandoc-citeproc` instead of `--citeproc`.
- Allow configuring Pandoc gohugoio#7529
  That PR is much more extensive and might eventually supersede this PR,
  but I think --bibliography and --citeproc should be independent
  options (--bibliography should be optional and citeproc can always be
  specified).
- Pandoc - allow citeproc extension to be invoked, with bibliography. gohugoio#8610
  Similar to gohugoio#7529, gohugoio#8610 adds a new config option to Hugo.
  I think passing --citeproc and letting the users decide on the
  metadata they want to pass to pandoc is better, albeit uglier.
shoeffner added a commit to shoeffner/hugo that referenced this pull request Jul 3, 2022
Adds the citeproc filter to the pandoc converter.

There are several PRs for it this feature already. However, I think
simply adding `--citeproc` is the cleanest way to enable this feature,
with the option to flesh it out later, e.g., in gohugoio#7529.

Some PRs and issues attempt adding more config options to Hugo which
indirectly configure pandoc, but I think simply configuring Pandoc via
Pandoc itself is simpler, as it is already possible with two YAML
blocks -- one for Hugo, and one for Pandoc:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    ...
    Document content with @citation!

There are other useful options, e.g., gohugoio#4800 attempts to use `nocite`,
which works out of the box with this PR:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    nocite: |
      @*
    ...
    Document content with no citations but a full bibliography:

    ## Bibliography

Other useful options are `csl: ...` and `link-citations: true`, which
set the path to a custom CSL file and create HTML links between the
references and the bibliography.

The following issues and PRs are related:

- Add support for parsing citations and Jupyter notebooks via Pandoc and/or Goldmark extension gohugoio#6101
  Bundles multiple requests, this PR tackles citation parsing.

- WIP: Bibliography with Pandoc gohugoio#4800
  Passes the frontmatter to Pandoc and still uses
  `--filter pandoc-citeproc` instead of `--citeproc`.
- Allow configuring Pandoc gohugoio#7529
  That PR is much more extensive and might eventually supersede this PR,
  but I think --bibliography and --citeproc should be independent
  options (--bibliography should be optional and citeproc can always be
  specified).
- Pandoc - allow citeproc extension to be invoked, with bibliography. gohugoio#8610
  Similar to gohugoio#7529, gohugoio#8610 adds a new config option to Hugo.
  I think passing --citeproc and letting the users decide on the
  metadata they want to pass to pandoc is better, albeit uglier.
shoeffner added a commit to shoeffner/hugo that referenced this pull request Jul 11, 2022
Adds the citeproc filter to the pandoc converter.

There are several PRs for it this feature already. However, I think
simply adding `--citeproc` is the cleanest way to enable this feature,
with the option to flesh it out later, e.g., in gohugoio#7529.

Some PRs and issues attempt adding more config options to Hugo which
indirectly configure pandoc, but I think simply configuring Pandoc via
Pandoc itself is simpler, as it is already possible with two YAML
blocks -- one for Hugo, and one for Pandoc:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    ...
    Document content with @citation!

There are other useful options, e.g., gohugoio#4800 attempts to use `nocite`,
which works out of the box with this PR:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    nocite: |
      @*
    ...
    Document content with no citations but a full bibliography:

    ## Bibliography

Other useful options are `csl: ...` and `link-citations: true`, which
set the path to a custom CSL file and create HTML links between the
references and the bibliography.

The following issues and PRs are related:

- Add support for parsing citations and Jupyter notebooks via Pandoc and/or Goldmark extension gohugoio#6101
  Bundles multiple requests, this PR tackles citation parsing.

- WIP: Bibliography with Pandoc gohugoio#4800
  Passes the frontmatter to Pandoc and still uses
  `--filter pandoc-citeproc` instead of `--citeproc`.
- Allow configuring Pandoc gohugoio#7529
  That PR is much more extensive and might eventually supersede this PR,
  but I think --bibliography and --citeproc should be independent
  options (--bibliography should be optional and citeproc can always be
  specified).
- Pandoc - allow citeproc extension to be invoked, with bibliography. gohugoio#8610
  Similar to gohugoio#7529, gohugoio#8610 adds a new config option to Hugo.
  I think passing --citeproc and letting the users decide on the
  metadata they want to pass to pandoc is better, albeit uglier.
shoeffner added a commit to shoeffner/hugo that referenced this pull request Jul 12, 2022
Adds the citeproc filter to the pandoc converter.

There are several PRs for it this feature already. However, I think
simply adding `--citeproc` is the cleanest way to enable this feature,
with the option to flesh it out later, e.g., in gohugoio#7529.

Some PRs and issues attempt adding more config options to Hugo which
indirectly configure pandoc, but I think simply configuring Pandoc via
Pandoc itself is simpler, as it is already possible with two YAML
blocks -- one for Hugo, and one for Pandoc:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    ...
    Document content with @citation!

There are other useful options, e.g., gohugoio#4800 attempts to use `nocite`,
which works out of the box with this PR:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    nocite: |
      @*
    ...
    Document content with no citations but a full bibliography:

    ## Bibliography

Other useful options are `csl: ...` and `link-citations: true`, which
set the path to a custom CSL file and create HTML links between the
references and the bibliography.

The following issues and PRs are related:

- Add support for parsing citations and Jupyter notebooks via Pandoc and/or Goldmark extension gohugoio#6101
  Bundles multiple requests, this PR tackles citation parsing.

- WIP: Bibliography with Pandoc gohugoio#4800
  Passes the frontmatter to Pandoc and still uses
  `--filter pandoc-citeproc` instead of `--citeproc`.
- Allow configuring Pandoc gohugoio#7529
  That PR is much more extensive and might eventually supersede this PR,
  but I think --bibliography and --citeproc should be independent
  options (--bibliography should be optional and citeproc can always be
  specified).
- Pandoc - allow citeproc extension to be invoked, with bibliography. gohugoio#8610
  Similar to gohugoio#7529, gohugoio#8610 adds a new config option to Hugo.
  I think passing --citeproc and letting the users decide on the
  metadata they want to pass to pandoc is better, albeit uglier.
shoeffner added a commit to shoeffner/hugo that referenced this pull request Jul 26, 2022
Adds the citeproc filter to the pandoc converter.

There are several PRs for it this feature already. However, I think
simply adding `--citeproc` is the cleanest way to enable this feature,
with the option to flesh it out later, e.g., in gohugoio#7529.

Some PRs and issues attempt adding more config options to Hugo which
indirectly configure pandoc, but I think simply configuring Pandoc via
Pandoc itself is simpler, as it is already possible with two YAML
blocks -- one for Hugo, and one for Pandoc:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    ...
    Document content with @citation!

There are other useful options, e.g., gohugoio#4800 attempts to use `nocite`,
which works out of the box with this PR:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    nocite: |
      @*
    ...
    Document content with no citations but a full bibliography:

    ## Bibliography

Other useful options are `csl: ...` and `link-citations: true`, which
set the path to a custom CSL file and create HTML links between the
references and the bibliography.

The following issues and PRs are related:

- Add support for parsing citations and Jupyter notebooks via Pandoc and/or Goldmark extension gohugoio#6101
  Bundles multiple requests, this PR tackles citation parsing.

- WIP: Bibliography with Pandoc gohugoio#4800
  Passes the frontmatter to Pandoc and still uses
  `--filter pandoc-citeproc` instead of `--citeproc`.
- Allow configuring Pandoc gohugoio#7529
  That PR is much more extensive and might eventually supersede this PR,
  but I think --bibliography and --citeproc should be independent
  options (--bibliography should be optional and citeproc can always be
  specified).
- Pandoc - allow citeproc extension to be invoked, with bibliography. gohugoio#8610
  Similar to gohugoio#7529, gohugoio#8610 adds a new config option to Hugo.
  I think passing --citeproc and letting the users decide on the
  metadata they want to pass to pandoc is better, albeit uglier.
shoeffner added a commit to shoeffner/hugo that referenced this pull request Oct 5, 2022
Adds the citeproc filter to the pandoc converter.

There are several PRs for it this feature already. However, I think
simply adding `--citeproc` is the cleanest way to enable this feature,
with the option to flesh it out later, e.g., in gohugoio#7529.

Some PRs and issues attempt adding more config options to Hugo which
indirectly configure pandoc, but I think simply configuring Pandoc via
Pandoc itself is simpler, as it is already possible with two YAML
blocks -- one for Hugo, and one for Pandoc:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    ...
    Document content with @citation!

There are other useful options, e.g., gohugoio#4800 attempts to use `nocite`,
which works out of the box with this PR:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    nocite: |
      @*
    ...
    Document content with no citations but a full bibliography:

    ## Bibliography

Other useful options are `csl: ...` and `link-citations: true`, which
set the path to a custom CSL file and create HTML links between the
references and the bibliography.

The following issues and PRs are related:

- Add support for parsing citations and Jupyter notebooks via Pandoc and/or Goldmark extension gohugoio#6101
  Bundles multiple requests, this PR tackles citation parsing.

- WIP: Bibliography with Pandoc gohugoio#4800
  Passes the frontmatter to Pandoc and still uses
  `--filter pandoc-citeproc` instead of `--citeproc`.
- Allow configuring Pandoc gohugoio#7529
  That PR is much more extensive and might eventually supersede this PR,
  but I think --bibliography and --citeproc should be independent
  options (--bibliography should be optional and citeproc can always be
  specified).
- Pandoc - allow citeproc extension to be invoked, with bibliography. gohugoio#8610
  Similar to gohugoio#7529, gohugoio#8610 adds a new config option to Hugo.
  I think passing --citeproc and letting the users decide on the
  metadata they want to pass to pandoc is better, albeit uglier.
@nealkruis
Copy link

@bep, things are fairly quiet on these Pandoc related PRs (see also #8911). What is needed still to push these across the finish line?

We rely on the flexibility of Pandoc for much of our content generation, but it doesn't do us much good if we can't configure it the way we would like to.

I believe the biggest security concern would be using Pandoc filters, which call an external script. Maybe Hugo can warn a user that a configuration is using filters and they should only proceed if they trust the source of the scripts?

@danmackinlay
Copy link

There may be a "voting with the feet" issue @nealkruis . I imagine many of the people who wished to use pandoc as the markup format in hugo wished to use various academic features, such as citations, mathematical markup etc which are not well-supported natively by hugo markup. Some of the effort that previously went into advocating hugo support for these "academic" pandoc features now goes into implementing missing hugo features in quarto. Quarto extends pandoc and thus includes all the typical academic features as a core feature, plus other ones such as rendering code output inline, including plots, producing PDFs and so on. We may have lost critical mass for pandoc support in hugo (which has been pandoc-resistant) in favour of an alternative which is pro-pandoc (although, of course, missing many hugo features). In particular, I believe there has been migration from the hugo-backed blogdown platform to quarto.

shoeffner added a commit to shoeffner/hugo that referenced this pull request Aug 8, 2023
Adds the citeproc filter to the pandoc converter.

There are several PRs for it this feature already. However, I think
simply adding `--citeproc` is the cleanest way to enable this feature,
with the option to flesh it out later, e.g., in gohugoio#7529.

Some PRs and issues attempt adding more config options to Hugo which
indirectly configure pandoc, but I think simply configuring Pandoc via
Pandoc itself is simpler, as it is already possible with two YAML
blocks -- one for Hugo, and one for Pandoc:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    ...
    Document content with @citation!

There are other useful options, e.g., gohugoio#4800 attempts to use `nocite`,
which works out of the box with this PR:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    nocite: |
      @*
    ...
    Document content with no citations but a full bibliography:

    ## Bibliography

Other useful options are `csl: ...` and `link-citations: true`, which
set the path to a custom CSL file and create HTML links between the
references and the bibliography.

The following issues and PRs are related:

- Add support for parsing citations and Jupyter notebooks via Pandoc and/or Goldmark extension gohugoio#6101
  Bundles multiple requests, this PR tackles citation parsing.

- WIP: Bibliography with Pandoc gohugoio#4800
  Passes the frontmatter to Pandoc and still uses
  `--filter pandoc-citeproc` instead of `--citeproc`.
- Allow configuring Pandoc gohugoio#7529
  That PR is much more extensive and might eventually supersede this PR,
  but I think --bibliography and --citeproc should be independent
  options (--bibliography should be optional and citeproc can always be
  specified).
- Pandoc - allow citeproc extension to be invoked, with bibliography. gohugoio#8610
  Similar to gohugoio#7529, gohugoio#8610 adds a new config option to Hugo.
  I think passing --citeproc and letting the users decide on the
  metadata they want to pass to pandoc is better, albeit uglier.
Current options are:

  markup:
    pandoc:
      filters:
        - list
        - of
        - filters
      extensions:
        - list
        - of
        - extensions
      extraArgs:
        - --extra-arguments
        - --one-per-line

Generalize some Pandoc options.

Support configuring a bibliography in markup config

Anonymous Update

[pandoc] Allow page parameters to override site parameters.

This allows specifying things like this in the page frontmatter:

    ---
    title: Something
    bibliography:
      source: mybibliography.bib
    pandoc:
      filter:
        - make-diagrams.lua
    ...

These options are local to the page. Specifying the same under `markup`
in the site configuration applies those settings to all pages.

Paths (filters, bibliography, citation style) are resolved relative to
the page, site, and the `static` folder.

[pandoc] Support metadata

Support specifying Pandoc metadata in the site configuration and page
configuration using the following syntax:

Site (in `config.yaml`):

    ```yaml
    markup:
        pandoc:
                metadata:
                        link-citations: true
    ```

Or in frontmatter:

    ```yaml
    ---
    pandoc:
        metadata:
                link-citations: true
    ...
    ```

[pandoc] Simplify path management.

No need for any fancy path lookup gymnastics. `pandoc`'s
`--resource-path` option does the legwork of locating resources on
multiple directories.

[pandoc] Don't use x != "" to denote failure.
Current options are:

  markup:
    pandoc:
      filters:
        - list
        - of
        - filters
      extensions:
        - list
        - of
        - extensions
      extraArgs:
        - --extra-arguments
        - --one-per-line

Generalize some Pandoc options.

Support configuring a bibliography in markup config

Anonymous Update

[pandoc] Allow page parameters to override site parameters.

This allows specifying things like this in the page frontmatter:

    ---
    title: Something
    bibliography:
      source: mybibliography.bib
    pandoc:
      filter:
        - make-diagrams.lua
    ...

These options are local to the page. Specifying the same under `markup`
in the site configuration applies those settings to all pages.

Paths (filters, bibliography, citation style) are resolved relative to
the page, site, and the `static` folder.

[pandoc] Support metadata

Support specifying Pandoc metadata in the site configuration and page
configuration using the following syntax:

Site (in `config.yaml`):

    ```yaml
    markup:
        pandoc:
                metadata:
                        link-citations: true
    ```

Or in frontmatter:

    ```yaml
    ---
    pandoc:
        metadata:
                link-citations: true
    ...
    ```

[pandoc] Simplify path management.

No need for any fancy path lookup gymnastics. `pandoc`'s
`--resource-path` option does the legwork of locating resources on
multiple directories.

[pandoc] Don't use x != "" to denote failure.
@tytyvillus
Copy link

@bep, things are fairly quiet on these Pandoc related PRs (see also #8911). What is needed still to push these across the finish line?

We rely on the flexibility of Pandoc for much of our content generation, but it doesn't do us much good if we can't configure it the way we would like to.

I believe the biggest security concern would be using Pandoc filters, which call an external script. Maybe Hugo can warn a user that a configuration is using filters and they should only proceed if they trust the source of the scripts?

This feels like a good solution because it works on an informed consent model.

Unfortunately I am tied to hugo because of institutional reasons, and if you want pandoc, move to quarto does not work for me. ^^'

But the way pandoc is currently implemented in hugo seems to me to be unusable; I've spent a few hours searching the documentation + forums and I have yet to find a way to officially specify even setting the -s flag on html documents, so that maths display correctly. (If I'm being blind and stupid and that is already possible in normal hugo, please let me know.)

@jmooring
Copy link
Member

jmooring commented May 1, 2024

@tytyvillus

Pandoc's -s flag is not related to math rendering in any way. We pass the --mathjax flag when rendering Pandoc files, so all you need to do is load MathJax.js on the page. Please open a topic in the forum if you need help.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.