RFC: HTML::?Filter ... suggest a name for the beast
Jenda
created: 2006-05-01 09:03:49

Morning lads,
I have a module (what a surprise) that I'd like to upload to CPAN, but ... I can't think of a name.

The module sits on top of HTML::Parser and allows you to filter the tags and attributes in your HTML. You specify what's allowed and everything else is removed.

There is a similar module on CPAN already, HTML::TagFilter. The differences between the two are that

  1. HTML::TagFilter has a lot more options and contains some other more or less related features (mangling email addresses and such)
  2. HTML::TagFilter assumes the filtering is controlled by the code, while HTML::?Filter accepts the list of tags and attributes as a single string. This makes it easier to use if the list is to be maintained by people other than the developers. (I am using the module in an application that needs to filter the HTML it sends to various clients, and the allowed tags vary from client to client. The developers are not involved in setting this up, though; the admins can specify what to allow themselves.)

I do think the module could come in handy to other people as well, but for the life of me I can't think of a reasonable name. (HTML::Filter is already taken.)

Any suggestions are welcome.

Re: RFC: HTML::?Filter ... suggest a name for the beast
created: 2006-05-01 09:13:27
You said it yourself!!
HTML::TagFilter::Simple
Re^2: RFC: HTML::?Filter ... suggest a name for the beast
created: 2006-05-01 09:37:46

That sounds as if it depended on or belonged to HTML::TagFilter. I would not want to stomp on someone else's namespace. So maybe rather "HTML::SimpleTagFilter".

I'd rather we found a way to communicate that the module is controlled by a string containing the allowed tags and attributes instead of from within the code.

Re^3: RFC: HTML::?Filter ... suggest a name for the beast
created: 2006-05-01 10:15:52
Foo::Bar::Simple (or Foo::Bar::Lite) is often written by someone other than the author of Foo::Bar. This is a common practice and well accepted by the community. If you have any concerns, ask modules@perl.org and they'll help out.

Re^4: RFC: HTML::?Filter ... suggest a name for the beast
created: 2006-05-01 11:42:34

Well, yes, actually one of my modules follows this naming style. Win32::Daemon::Simple provides a simplified (yet at the same time extended) interface to Win32::Daemon. The thing is that HTML::?Filter doesn't use HTML::TagFilter. Then there's CGI::Simple, which provides part of the CGI.pm functionality, so the code of the module is simpler ... this is not the case either. HTML::?Filter is totally independent of HTML::TagFilter. I did not know about HTML::TagFilter (if it was even on CPAN already, which I doubt) when writing HTML::?Filter.

Re^5: RFC: HTML::?Filter ... suggest a name for the beast
created: 2006-05-02 10:16:07
CGI::Lite is a more lightweight version of much of what CGI does. Likewise, OLE::Storage_Lite is a lighter alternative to OLE::Storage.

Note that it is "Lite", not "Simple". Simple indeed indicates a form of inheritance, such as HTML::TokeParser::Simple.

Re^3: RFC: HTML::?Filter ... suggest a name for the beast
created: 2006-05-01 17:25:41
Maybe something in the way your module works can give someone an idea? What about posting the POD of the module here, to see if it lights a bulb somewhere?
Re^4: RFC: HTML::?Filter ... suggest a name for the beast
created: 2006-05-01 18:42:19

There's not a lot to document so I think the synopsis will suffice:

use HTML::JFilter;

my $filter = HTML::JFilter->new(<<'*END*');
b i code pre br
a: href name
font: color size style
*END*

$filteredHTML = $filter->doSTRING($enteredHTML);
$filter->doFILE($inputfile, $outputfile);
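
A minimal sketch of how a specification string in the format above could be turned into a lookup table. This is not the module's actual code: parse_allowed and the hash layout are hypothetical, and the sketch assumes that a plain line lists tags allowed without attributes while a "tag: attr attr" line lists the attributes allowed on one tag.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the module: turn the specification
# string into a hash of the form { tag => { attribute => 1, ... } }.
sub parse_allowed {
    my ($spec) = @_;
    my %allowed;
    for my $line (split /\n/, $spec) {
        $line =~ s/^\s+|\s+$//g;      # trim surrounding whitespace
        next unless length $line;     # skip blank lines
        if ($line =~ /^(\w+)\s*:\s*(.*)$/) {
            # "tag: attr attr ..." allows one tag with the listed attributes
            my ($tag, $attrs) = (lc $1, $2);
            $allowed{$tag} ||= {};
            $allowed{$tag}{lc $_} = 1 for split ' ', $attrs;
        }
        else {
            # a plain line lists tags that are allowed without any attributes
            $allowed{lc $_} ||= {} for split ' ', $line;
        }
    }
    return \%allowed;
}

my $allowed = parse_allowed(<<'*END*');
b i code pre br
a: href name
font: color size style
*END*

print join(' ', sort keys %$allowed), "\n";           # a b br code font i pre
print join(' ', sort keys %{ $allowed->{a} }), "\n";  # href name
```

Keeping the result as a hash of hashes makes the later "is this tag allowed, and is this attribute allowed on it" checks simple lookups.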

I don't like the interface either, actually. Maybe the methods should be Filter() and FilterFile(). Or maybe I could provide a functional interface instead or in addition:

my $filter = HTML::JFilter::MakeFilter ($allowed_tags);
$filtered = $filter->( $html);
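
A rough sketch of what such a closure-returning constructor could look like on top of HTML::Parser (version 3 API). It is an illustration under assumptions, not the module's code: make_filter is a hypothetical name, and it takes a hash of allowed tags and attributes rather than the specification string. Note that the text inside dropped tags (including script and style) still passes through in this sketch.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Hypothetical sketch: make_filter and its hash argument are illustrative,
# not the module's interface. $allowed is { tag => { attr => 1, ... }, ... }.
sub make_filter {
    my ($allowed) = @_;
    return sub {
        my ($html) = @_;
        my $out = '';
        my $parser = HTML::Parser->new(
            api_version => 3,
            # Re-emit an allowed start tag with only its allowed attributes.
            start_h => [ sub {
                my ($tag, $attr, $attrseq) = @_;
                my $ok = $allowed->{$tag} or return;
                $out .= "<$tag";
                for my $name (grep { $ok->{$_} } @$attrseq) {
                    # Attribute values arrive decoded, so escape them again.
                    my $value = $attr->{$name};
                    $value =~ s/&/&amp;/g;
                    $value =~ s/</&lt;/g;
                    $value =~ s/"/&quot;/g;
                    $out .= qq{ $name="$value"};
                }
                $out .= '>';
            }, 'tagname, attr, attrseq' ],
            # Keep an end tag only if the tag itself is allowed.
            end_h  => [ sub { $out .= "</$_[0]>" if $allowed->{ $_[0] } }, 'tagname' ],
            # Text passes through unchanged; events without a handler
            # (comments, declarations, ...) are simply dropped.
            text_h => [ sub { $out .= $_[0] }, 'text' ],
        );
        $parser->parse($html);
        $parser->eof;
        return $out;
    };
}

my $filter = make_filter({ b => {}, a => { href => 1 } });
print $filter->('<p>Hi <b>there</b> <a href="x" onclick="y">link</a></p>'), "\n";
# prints: Hi <b>there</b> <a href="x">link</a>
```

Because the parser and the output buffer are created inside the returned sub, the same $filter can be called repeatedly without any state leaking between calls.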

Re^5: RFC: HTML::?Filter ... suggest a name for the beast
created: 2006-05-02 06:21:12
Yep, I vote for "Filter()" and "FilterFile()" instead... For the module, what do you think of HTML::Sieve as a name?
Re: RFC: HTML::?Filter ... suggest a name for the beast
created: 2006-05-01 17:44:30

What about HTML::Parser::Filter? Generally, though, if yours works in a similar but simpler fashion to HTML::TagFilter, then I would go with HTML::TagFilter::Simple. Think of the relation more in terms of what they do, not how they do it.


___________
Eric Hodges
Re: RFC: HTML::?Filter ... suggest a name for the beast
created: 2006-05-01 18:05:10

How about HTML::Subset or HTML::Subset::Filter?


Re: RFC: HTML::?Filter ... suggest a name for the beast
created: 2006-05-01 20:55:43
I say you keep calling it HTML::JFilter.

update: or maybe HTML::FilterJ, or HTML::Filterer :)

Re: RFC: HTML::?Filter ... suggest a name for the beast
created: 2006-05-01 23:24:50
s/::Filter/::Cleaner/ ? or maybe Remover


Re: RFC: HTML::?Filter ... suggest a name for the beast
created: 2006-05-01 23:27:51

HTML::LimitTags?


Re: RFC: HTML::?Filter ... suggest a name for the beast
created: 2006-05-02 09:30:59

I do think the module could come in handy to other people as well, but for the life of me I can't think of a reasonable name. (HTML::Filter is already taken.)


HTML::TagElide ? "Elide" is a great word that I confess I learned just last year. Now that doesn't sound like a selling point, but think about it.



Re: RFC: HTML::?Filter ... suggest a name for the beast
created: 2006-05-03 14:28:27

Suggestions, I've got in spades:

HTML::Tag::*

  1. ::Filter (might be confusing)
  2. ::Cleaner (cleans tags? maybe not...)
  3. ::Remove (removes tags... accurate!)
  4. ::Sieve (keeps some tags, removes others, like a filter in a way)
  5. ::Restrict (restricts HTML to allowed tags)
  6. ::Replace (replaces Tags with other data)

I've bolded my two favorites. Those are my suggestions. The jury is out on whether they are good suggestions, but you did say that any suggestions were welcome. ;-)

<-radiant.matrix->
Re: RFC: HTML::?Filter ... suggest a name for the beast
created: 2006-05-03 23:56:08
I would propose:

HTML::Intersection

I mean the tags that the user asks to keep, out of all the tags that appear in the HTML page in question. It is like a mathematical intersection...

Re: RFC: HTML::?Filter ... suggest a name for the beast
sgt
created: 2006-05-04 17:43:07
HTML::TagFilterEasy
Re: RFC: HTML::?Filter ... suggest a name for the beast
created: 2006-05-12 01:11:58
If I read your module description correctly, HTML::TagFilter isn't the only module that filters tags and attributes. HTML::Scrubber and HTML::Sanitizer both do this.

perlmonks.org content © perlmonks.org and astroboy, bart, BrowserUk, chanio, dragonchild, eric256, Gavin, GrandFather, Intrepid, Jenda, PodMaster, radiantmatrix, sgt, sh1tn, wazoox
