Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

aggressively converts html entities, not really usable for email templates #22

Open
jvanasco opened this issue Apr 29, 2013 · 9 comments

Comments

@jvanasco
Copy link

First off, thanks. This is a great package + well designed & documented.

I ran into some issues while trying to get this to work with mako templates (which i use to generate email).

mako templates don't work because the BeautifulSoup parser and formatter will kill things like:

<% value = 1 + 1 %>

and

% if loop.index < 6: 
% endif

becomes

% if loop.index &lt; 6: 
% endif

I tried forking and running on BS4, which allows for a 'formatter' option to be used on printing ( overriding your _get_output function ). I couldn't find a formatter that creates reliable or desirable results.

i think the design of this package will make it hard to work on templates for email , so i wanted to suggest a warning in the documents. If I have time, I might try and get this to work later in the week -- but I'm sadly not hopeful.

Anyways, great package.

@rennat
Copy link
Owner

rennat commented Apr 29, 2013

Thanks for the input! I haven't tried to use this on any templating
languages directly. I'm not opposed to supporting template languages
especially if it just means passing the correct beautiful soup arguments.

Pynliner was first used on a django project where we used templates to
render the emails to HTML then ran pynliner on them before sending the
emails out. I think this approach eliminates the need for running pynliner
on templates directly but I'm sure there are valid scenarios for it that I
haven't thought of.
On Apr 29, 2013 12:05 PM, "jvanasco" notifications@github.com wrote:

First off, thanks. This is a great package + well designed & documented.

I ran into some issues while trying to get this to work with mako
templates (which i use to generate email).

mako templates don't work because the BeautifulSoup parser and formatter
will kill things like:

<% value = 1 + 1 %>

and

% if loop.index < 6:
% endif

becomes

% if loop.index < 6:
% endif

I tried forking and running on BS4, which allows for a 'formatter' option
to be used on printing ( overriding your _get_output function ). I
couldn't find a formatter that creates reliable or desirable results.

i think the design of this package will make it hard to work on templates
for email , so i wanted to suggest a warning in the documents. If I have
time, I might try and get this to work later in the week -- but I'm sadly
not hopeful.

Anyways, great package.


Reply to this email directly or view it on GitHubhttps://github.com//issues/22
.

@jvanasco
Copy link
Author

If you're sending out a non-personalized marketing email, it's no big deal. You only need to do this once.

If you're sending out 1000+ of customized transactional emails, this can take quite a bit of CPU and time. on one of my boxes, it average 1.5s to parse and inline an email that is 'nonstandard' and .5s to parse a 'standard' one (via timeit, and only timing css_inlined = pynliner.fromString(html) ).

non-standard = hand drawn html
standard = non-standard , pumped into BS, then printed out ( this way I ensure its exactly what BS likes )

having a .5 - 1.5s hit on every transactional or customized email is way too much of an overhead for our volume. Our current list size would take around an hour of processing time.

@rennat
Copy link
Owner

rennat commented Apr 29, 2013

You make a good case! Supporting template languages would be a great new
feature as well as speeding up the whole process.
On Apr 29, 2013 2:40 PM, "jvanasco" notifications@github.com wrote:

If you're sending out a non-personalized marketing email, it's no big
deal. You only need to do this once.

If you're sending out 1000+ of customized transactional emails, this can
take quite a bit of CPU and time. on one of my boxes, it average 1.5s to
parse and inline an email that is 'nonstandard' and .5s to parse a
'standard' one (via timeit, and only timing css_inlined =
pynliner.fromString(html) ).

non-standard = hand drawn html
standard = non-standard , pumped into BS, then printed out ( this way I
ensure its exactly what BS likes )

having a .5 - 1.5s hit on every transactional or customized email is way
too much of an overhead for our volume. Our current list size would take
around an hour of processing time.


Reply to this email directly or view it on GitHubhttps://github.com//issues/22#issuecomment-17189198
.

@tclancy
Copy link
Contributor

tclancy commented Jun 18, 2013

I think this is also causing a problem I'm seeing where it is converting some of the <> signs inside a conditional comment, which breaks the rendering of buttons for Outlook if you use the HTML provided by http://emailbtn.net/ turning them into things like this:

<!--[if mso]&gt;
  &lt;v:roundrect xmlns:v="urn:schemas-microsoft-com:vml" xmlns:w="urn:schemas-microsoft-com:office:word" href="http://localhost:8000/account/confirm_email/f10c6a42a92d1f16bb8d79d635f58fec56c4b7fb/" style="height:40px;v-text-anchor:middle;width:200px;" arcsize="8%" stroke="f" fillcolor="#da2929"&gt;
    &lt;w:anchorlock /&gt;
    &lt;center&gt;
  &lt;![endif]-->

Is this something BeautifulSoup is doing or is it the package? It looks like BS had problems with conditional comments in the past but there appear to be some workarounds. If you let me know where to look, I can take a crack at this as the package has proved very helpful.

@rennat
Copy link
Owner

rennat commented Jun 18, 2013

I'll need to do some investigating on where this occurs in the process. It
would be helpful if you commit some test cases that fail for this reason.

On Tue, Jun 18, 2013 at 10:49 AM, Tom Clancy notifications@github.51.alwrote:

I think this is also causing a problem I'm seeing where it is converting
some of the <> signs inside a conditional comment, which breaks the
rendering of buttons for Outlook if you use the HTML provided by
http://emailbtn.net/ turning them into things like this:

Is this something BeautifulSoup is doing or is it the package? It looks
like BS had problems with conditional comments in the pasthttps://groups.google.com/forum/?fromgroups#!topic/beautifulsoup/eMKhD3T2z20but there appear
to be some workaroundshttp://stackoverflow.com/questions/132488/regex-to-remove-conditional-comments.
If you let me know where to look, I can take a crack at this as the package
has proved very helpful.


Reply to this email directly or view it on GitHubhttps://github.com//issues/22#issuecomment-19620504
.

@tclancy
Copy link
Contributor

tclancy commented Jun 18, 2013

Does this work/ do you need additional tests? Not sure if it's too simplistic or overly complicated.

@rennat
Copy link
Owner

rennat commented Jun 18, 2013

Thanks I'll take a look this evening.

On Tue, Jun 18, 2013 at 2:27 PM, Tom Clancy notifications@github.51.alwrote:

Does this work/ do you need additional tests? Not sure if it's too
simplistic or overly complicated.


Reply to this email directly or view it on GitHubhttps://github.com//issues/22#issuecomment-19634816
.

@rennat
Copy link
Owner

rennat commented Jun 20, 2013

@jvanasco I'm still planning a complete rewrite with backwards compatibility on the same interface to be faster and more flexible than relying on beautiful soup but until the rewrite it has to go through beautiful soup which aggressively converts to HTML.

@rennat
Copy link
Owner

rennat commented Nov 20, 2014

TL;DR Work on the rewrite is continuing as I have time. My primary focus is making the conversion process more efficient. My secondary focus is supporting templating languages in an extensible way.


I've recently been working on the rewrite and while trying for a flexible conversion process I've realized that while pynliner 1.0 can (and will) be built to support templating languages and made extensible to allow other templating languages/uses, complete CSS support cannot happen before a template is rendered.

A simple example showing this if some_x = [0,1,2]:

Template

<div>
{{ for x in some_x }}
    <span class="an-x">{{x}}</span>
{{ endfor }}
</div>

Stylesheet

.an-x { color: red; }
.an-x:last-child { color: blue; }

Output if template rendered first (Correct)

<div>
    <span class="an-x" style="color:red">0</span>
    <span class="an-x" style="color:red">1</span>
    <span class="an-x" style="color:blue">2</span>
</div>

Output if applied to template first (Incorrect)

<div>
    <span class="an-x" style="color:blue">0</span>
    <span class="an-x" style="color:blue">1</span>
    <span class="an-x" style="color:blue">2</span>
</div>

The problem here is the :last-child selector which, among other selectors, cannot be tested for correctly until the template is rendered. This does not mean we cannot support processing templates. It does reduce functionality to a subset of CSS.

None of this prevents us from enabling conversion on templates but I will be focusing my effort on making the conversion process more efficient first.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants