<P>'d Off with Formatting Options!

21 posts by 5 authors in: Forums > CMS Builder
Last Post: May 21, 2009

Re: [Dave] <P>'d Off with Formatting Options!

By Djulia - May 13, 2009 - edited: May 13, 2009

Hi,

I have a problem with TinyMCE.
It adds empty <p></p> tags, and I would like to remove them.

1) <p></p>
2) <p> </p>

I do not want to change the force_p_newlines option (it is very useful!).
I tested the function from http://www.webmasterworld.com/forum88/7286.htm
but it removes only "<p></p>"; "<p> </p>" is not removed.

<?php
$html = "<a></a><b>non-empty</b>";

function removeemptytags($html_replace)
{
    // Match an opening tag, optional whitespace, then a closing tag.
    $pattern = "/<[^\/>]*>([\s]?)*<\/[^>]*>/";
    return preg_replace($pattern, '', $html_replace);
}

// Usage:
echo removeemptytags($html);
// Returns '<b>non-empty</b>'
?>


Does anybody have a suggestion?

I think the problem comes from UTF-8 (?).
The space must be a UTF-8 character.


Thanks for your assistance.

Djulia

Re: [Djulia] <P>'d Off with Formatting Options!

By Dave - May 13, 2009

Hi Djulia,

Do you have an example of the space character it's not removing? This works for me:

<?php

function removeemptytags($html_replace) {
    // Opening tag, any amount of whitespace, closing tag.
    $pattern = "|<[^/>]*>[\s]*</[^>]*>|s";
    return preg_replace($pattern, '', $html_replace);
}

// prints: HelloWorldAgainAgain!
print "<xmp>";
print removeemptytags("Hello<p></p>World<p> </p>Again<p>\n</p>Again!");
print "</xmp>";

?>


I simplified the code a little bit. If you knew what the other character was, such as the hex code for it, you could add it beside the \s (which means whitespace character such as \n or space, etc).
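
One way to find that hex code is PHP's built-in bin2hex(), which prints the raw bytes of a string. A minimal sketch, assuming the stray character is a UTF-8 no-break space; the $sample value is made up for illustration and stands in for the real field content:

```php
<?php
// Hypothetical sample: a paragraph containing a UTF-8 no-break space (bytes C2 A0).
$sample = "<p>\xC2\xA0</p>";

// bin2hex() returns every byte as two hex digits, so invisible characters show up.
$hex = bin2hex($sample);
echo $hex, "\n"; // prints: 3c703ec2a03c2f703e
```

Here 3c 70 3e is <p>, 3c 2f 70 3e is </p>, and the c2 a0 in between is the character to add beside the \s.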

Hope that helps!
Dave Edis - Senior Developer
interactivetools.com

Re: [Dave] <P>'d Off with Formatting Options!

By Djulia - May 14, 2009 - edited: May 14, 2009

Hi Dave, Thanks,

If I use the function directly in the page, there is no problem. But if I apply it to a field from the table ($record['page']), the function does not remove the tags.

The reason is that TinyMCE automatically produces a paragraph containing a no-break (non-breaking) space to comply with the standard: <p>[no-break space]</p>


Here is what my UTF-8 page gives when displayed as ISO-8859-1: Capture3.gif
utf8_decode($record['page'])

I do not know if I am explaining this correctly.

It seems that many TinyMCE users encounter this problem.
It is discussed a lot on the TinyMCE forum.


While waiting for a good solution, I found:
$chaine = preg_replace('#<p>[^A-Za-z0-9]*</p>#i', '', $chaine);

It is not the best solution, because it does not remove other empty tags (<h2></h2>, <h3></h3>, ...).

Djulia

Attachments:

capture3_001.gif 2K

Re: [Djulia] <P>'d Off with Formatting Options!

By Dave - May 14, 2009

Hi Djulia,

We just need to identify the strings or characters you want to match and then add them to the list. Try this:

function removeemptytags($html_replace) {
$pattern = "/<[^\/>]*>(\s|&nbsp;|\xFF\xFD)*<\/[^>]*>/si";
return preg_replace($pattern, '', $html_replace);
}

print "<xmp>";
print removeemptytags("Hello<p>Again</p><p> </p>Again<p>\n</p>Again!<p>\xFF\xFD &nBsp; </p><p>one</p><p>two</p><p>three</p>");
print "</xmp>";
exit;


That should remove <p></p> tags that contain spaces, &nbsp;, the FF-FD sequence, or any combination of those.

Hope that helps, let me know if that works for you.
Dave Edis - Senior Developer
interactivetools.com

Re: [Dave] <P>'d Off with Formatting Options!

By Djulia - May 14, 2009

No, that does not work.

Converting UTF-8 to ISO gives me:

<p>Â </p>
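
The "Â " is what the two bytes C2 A0 look like when read as ISO-8859-1, and C2 A0 is how UTF-8 encodes the no-break space (U+00A0). A possible sketch, assuming the field is stored as UTF-8 and that this is the only stray character; the function name remove_empty_tags_nbsp and the test string are made up for this example:

```php
<?php
// Sketch only: assumes UTF-8 input where the stray character is U+00A0.
function remove_empty_tags_nbsp($html)
{
    // \xC2\xA0 is the UTF-8 byte sequence for the no-break space.
    $pattern = "/<[^\/>]*>(?:\s|&nbsp;|\xC2\xA0)*<\/[^>]*>/i";
    return preg_replace($pattern, '', $html);
}

// Hypothetical input: a raw no-break space, an &nbsp; entity, and real text.
$out = remove_empty_tags_nbsp("Hello<p>\xC2\xA0</p>World<p>&nbsp; </p>!<p>text</p>");
echo $out, "\n"; // prints: HelloWorld!<p>text</p>
```

Like the earlier versions, this does not check that the closing tag matches the opening one, so it is worth testing on real content first.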

There is a discussion on this subject here:
http://htmlpurifier.org/phorum/read.php?3,3312,3312

Thanks for your patience.

Djulia

Re: [Dave] <P>'d Off with Formatting Options!

By Djulia - May 14, 2009

Hi Dave,

I may have found a solution:
[^\w] (non-word character)

function removeemptytags($html_replace) {
    $pattern = "/<[^\/>]*>(\s|&nbsp;|[^\w])*<\/[^>]*>/si";
    return preg_replace($pattern, '', $html_replace);
}
echo removeemptytags($record['page']);

That works, but I do not know whether it could cause problems with the content.

Do you have an opinion?

Thanks, Djulia

Re: [Djulia] <P>'d Off with Formatting Options!

By Dave - May 15, 2009

That should work fine, as long as your content has alphanumeric characters in it, such as a-z and 0-9. If a paragraph is made up entirely of UTF-8 high-ASCII characters, though, it might be removed.

The best approach would be to run some tests to find out.
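
A quick test along those lines, as a sketch: it runs the [^\w] pattern over three made-up paragraphs. It assumes PHP 7 or later (for the \u{...} escapes) and the default C locale, where \w only covers ASCII letters, digits and the underscore.

```php
<?php
function removeemptytags($html_replace) {
    $pattern = "/<[^\/>]*>(\s|&nbsp;|[^\w])*<\/[^>]*>/si";
    return preg_replace($pattern, '', $html_replace);
}

// Hypothetical content: plain text, punctuation only, and two Japanese characters.
$html = "<p>abc</p><p>!!!</p><p>\u{65E5}\u{672C}</p>";
$result = removeemptytags($html);
echo $result, "\n"; // prints: <p>abc</p>
```

Only the first paragraph survives: the punctuation-only and Japanese paragraphs contain no ASCII word characters, so they are treated as empty.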

Hope that helps!
Dave Edis - Senior Developer
interactivetools.com

Re: [Dave] <P>'d Off with Formatting Options!

By rconring - May 21, 2009

Dave ...
I need to remove the TRAILING </p> tag in order to append a "Read full article" link to the summary without a line break. I tried adding a slash to the code you gave for removing the leading <p>, but it won't work ... obviously not correct syntax.

<!-- remove leading <p>, do this _before_ displaying field value -->
<?php $record['summary'] = preg_replace("/^\s*</p>/i", "", $record['summary'] ); ?>

What would do this for me?

PHP challenged ...
Ron Conring
Conring Automation Services
----------------------------------------
Software for Business and Industry Since 1987

Re: [rconring] <P>'d Off with Formatting Options!

By Dave - May 21, 2009

Hi Ron,

Try this:

$record['summary'] = preg_replace("|</p>\s*$|i", "", $record['summary'] );

That means: match </p>, followed by zero or more whitespace characters, followed by the end of the string, and remove it.
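
Put together with the "Read full article" link, that might look like the sketch below. The $record contents and the article.php URL are made up for illustration.

```php
<?php
// Hypothetical record; in a real page this would come from the database.
$record = array('summary' => "<p>Short summary of the article.</p>\n");

// Strip the trailing </p> and any whitespace after it.
$record['summary'] = preg_replace("|</p>\s*$|i", "", $record['summary']);

// Append the link inside the paragraph, then close the paragraph again.
$output = $record['summary'] . ' <a href="article.php">Read full article</a></p>';
echo $output, "\n";
// prints: <p>Short summary of the article. <a href="article.php">Read full article</a></p>
```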

Let me know if that works for you.
Dave Edis - Senior Developer
interactivetools.com