perl-python
Messages 69 - 98 of 127

#98 From: xah lee <xah@...>
Date: Fri Sep 2, 2005 6:16 am
Subject: gzip a file
 
20050830

Here is an example of how to decompress a gzip file using Python.

# -*- coding: utf-8 -*-
# Python

import gzip

# read the whole compressed file into memory
inF = gzip.GzipFile("/Users/t/access_log.1.gz", 'rb')
s = inF.read()
inF.close()

# write the decompressed bytes out
outF = open("/Users/t/access_log.1", 'wb')
outF.write(s)
outF.close()


Here is an example of compressing a file with gzip using Python.

# -*- coding: utf-8 -*-
# Python

import gzip

# read the whole input file into memory
inF = open("x.txt", 'rb')
s = inF.read()
inF.close()

# write it back out through the gzip compressor
outF = gzip.GzipFile("x.txt.gz", 'wb')
outF.write(s)
outF.close()


For more detail, see http://python.org/doc/2.4.1/lib/module-gzip.html
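
The two examples above read the whole file into memory before writing it
out. For a large log file a streaming copy may be preferable. Here is a
minimal sketch, assuming a current Python 3; the helper names gzip_compress
and gzip_decompress are made up for illustration and are not part of the
gzip module.

```python
import gzip
import shutil


def gzip_compress(src_path, dst_path):
    """Compress src_path into dst_path, copying chunk by chunk."""
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)


def gzip_decompress(src_path, dst_path):
    """Decompress the gzip file src_path into dst_path, chunk by chunk."""
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
```

Usage would be gzip_compress("x.txt", "x.txt.gz") and
gzip_decompress("x.txt.gz", "x.txt"). The with statement closes both files
even if the copy fails part way.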

Perl does not come with a gzip module bundled, but one can easily call
the unix gzip with qx (assuming you are on unix).

# perl

qx(gzip x.txt);
qx(gzip -d x.txt.gz);


Several modules are available online to do compression, such as IO::Zlib
and PerlIO::gzip.

----------
this post is archived at:
http://xahlee.org/perl-python/gzip.html

#97 From: xah lee <xah@...>
Date: Thu Sep 1, 2005 11:10 am
Subject: On Python's Documentation
 
On Python's Documentation

Xah Lee, 20050831

I'm very sorry to say that the Python doc is one of the worst possible
in the industry. I'm very sick of Perl, with its intentional obfuscation
and the juvenile drivel style of its docs. I always wanted to learn Python
as a replacement for Perl, and this year i did. I thought Python would be
much better. But now i know that although the language is better, its
documentation is effectively worse than Perl's.

• Official Perl doc: http://www.perl.com/pub/v/documentation
• Official Python doc: http://python.org/doc/2.4.1/

The Perl docs are lousy from the outset, because their writers immerse
themselves in drivel. It is part of their unix culture. Nevertheless,
one thing the Perl doc does not do is presume a superficial Computer
Science stance. In fact, its culture sneers at the computer science
establishment (which has caused major harm in the industry). There are
quite a lot of things wrong with the Perl docs. But at least they are
not shy to use examples, lots of them.

Now, i had thought that the Python doc would be entirely different. The
Python community is in many ways the antithesis of the Perl community.
Python didn't start with a hacking mentality, and i presumed its people
care a lot more about correctness and quality. Unfortunately, as i now
know, its doc is even worse than Perl's. Several problems lie at the
core:

• its technical writing is extremely poor. (likewise Perl)
• its technical content clearly shows that the writers can't or didn't
  think clearly. (one confused ball; likewise Perl)
• its organization exhibits the worst abstruse insensibilities of tech
  geekers. (likewise Perl, exemplified by the infamous unix man pages,
  but at least Perl/unix has spunk)
• its organization and content presentation has a computer science
  pretension.

The Computer Science Pretension aspect is the most egregious, and does
the most damage to the Python doc. The text becomes incomprehensible
abstraction sans any example, and it is impossible to locate desired
functionalities. Much like unix man pages, it requires the reader to
have familiarity with the entire doc to be able to use it fruitfully.

As i have expressed before (see
http://xahlee.org/Periodic_dosage_dir/t2/xlali_skami_cukta.html ), the
python doc has a huge number of problems. To remedy them, it needs a
major overhaul if not a complete rewrite.

Just about the only worthwhile part of the official doc set is the
Tutorial section.

The “Language Reference” section (subtitled “for language lawyers”)
needs to be replaced by human-readable descriptions of Python's
functions, for example in the style of the official Java doc (
http://java.sun.com/j2se/1.4.2/docs/api/index.html ). The Library
Reference section and the Global Module Index are both in a not very
useful state. These 3 sections are all an incomprehensible blur.

i haven't read much of the other sections:

• Macintosh Library Modules
    (for language lawyers)
• Extending and Embedding
    (tutorial for C/C++ programmers)
• Python/C API
    (reference for C/C++ programmers)
• Documenting Python
    (information for documentation authors)
• Installing Python Modules
    (information for installers & sys-admins)
• Distributing Python Modules
but all these should probably not be bundled with the official doc set.

I would like to see the Python doc get a complete rewrite.

First of all, the doc system LaTeX needs to go. (TeX itself is an
OpenSource crime, and its use as Python's doc system is an illustration
of the damage. See this unedited rant
http://xahlee.org/Periodic_dosage_dir/t2/TeX_pestilence.html )

Then, the doc needs to be written with a few principles:

• communicate to programers how to use it. (as opposed to being a semi
  description of implementation and compiler process, or in line with some
  computer sciency model or software engineering methodology fad)
• write with the goal of effective communication. In writing, avoid
  highbrow words and long sentences, and do focus on concision and
  precision. In content, avoid philosophical outlook, jargon population,
  author masturbation, arcane technicalities, gratuitous cautions, geek
  tips, juvenile coolness ... etc.
• document with consideration of the programer's tasks to be performed.
• orient the document around the language's exhibited functionalities and
  concrete behaviors. (as opposed to some milieu of computer sciency model)
• give ample examples.

(for detail, study several of my Python criticisms from the link
mentioned above)

I have not been in the Python community long and have not delved into
it. Is there other documentation that can be a basis for a new Python
doc? The one i know is the Quick Reference by Richard Gruet. (
http://rgruet.free.fr/PQR24/PQR2.4.html ) As a quick reference, it
provides concrete documentation of Python functionalities, and is an
excellent basis for new documentation. However, being a Quick Reference
it is very terse, and consequently needs a lot of work if it is to become
full documentation.

Of course, the other major hurdle to progress (for a new doc) is a
political one. It is, alas, always the biggest problem.

The Python doc wiki at http://pydoc.amk.ca/frame.html is a great idea.
For this to work, two things need to be done:

1. the official python site Python.org should point to the wiki as THE
official python doc.

2. the wiki should carry an authoritarian guide on how to write the doc.
(the guide based on the principles i gave above. Of course, it needs to
be clarified and elaborated with many cases in point.)

Both are equally important. Without (1), the wiki will never be
prominent. Without (2), it will remain geek drivel. (in this respect,
similar to how wikipedia's texts drift into a form of academic
esoterica whose educational value and readability are greatly reduced
for the general public.)
----------
This post is archived at:
http://xahlee.org/UnixResource_dir/writ/python_doc.html

   Xah
   xah@...
   http://xahlee.org/

#96 From: danny staple <orionrobots@...>
Date: Tue Aug 30, 2005 12:11 pm
Subject: Re: Why OpenSource Documentation is of Low Quality
 
My opinion on this is that coders don't like writing docs. And open
source is mostly unheard of outside the developer/geek community.
Technical writers are generally only familiar with the MS world (Word,
Publisher), and some are also familiar with Adobe. The open source
projects don't even register on their radar.

The way to get some decent, QA'd documentation on open source is to
put out bounties (like rentacoder.com) to get it sorted. When a few
companies have put out a bounty, then somebody may like to go and
collect it. Most non open-source developers only really bother putting
work into something when money changes hands, so the bounty method is
the only way forward in that instance: it then makes it worth
someone's time to contribute. The question then is who needs the
documents so much that they want to sponsor a project on it? After
all, having decent documents (or for that matter well thought out
UIs) means that their TCO would go down, as they would no longer need
to hire specialists in such short supply for certain operations with
open source tools.

Orion
--
http://orionrobots.co.uk - Build Robots

On 30/08/05, xah lee <xah@...> wrote:
>  Why OpenSource Documentation is of Low Quality
>
>  Xah Lee, 200508
>
>  previously i've made serious criticisms on Python's documentations
>  problems. (see
> http://xahlee.org/perl-python/re-write_notes.html )
>
>  I have indicated that a exemplary documentation is Wolfram Research
>  Incorporated's Mathematica language. (available online at
>  http://documents.wolfram.com/mathematica/ )
>
>  Since Mathematica is a proprietary language costing over a thousand
>  dollars and most people in the IT industry are not familiar with it, i
>  like to announce a new discovery:
>
>  this week i happened to read the documentation of Microsoft's
>  JavaScript. See
>  http://msdn.microsoft.com/library/en-us/script56/html/
>  js56jsconjscriptfundamentals.asp
>
>  This entire documentary is a paragon of technical writing. It has
>  clarity, conciseness, and precision. It does not abuse jargons, it
>  doesn't ramble, it doesn't exhibit author masturbation, and it covers
>  its area extremely well and complete. The documentation set are very
>  well organized into 3 sections: Fundamentals, Advanced, Reference. The
>  tutorial section "fundamentals" is extremely simple and to the point.
>  The "advanced" section gives a very concise yet easy to read on some
>  fine details of the language. And its language reference section is
>  complete and exact.
>
>  This is not the only good documentation in the industry. As i have
>  indicated, Mathematica documentation is equally excellent. In fact, the
>  official Java documentation (so-called Java API by Sun Microsystems
>  http://java.sun.com/j2se/1.4.2/docs/api/index.html ) is also extremely
>  well-written, even though that Java the language is unnecessarily very
>  complex and involves far more technical concepts that necessitate use
>  of proper jargons as can be seen in their doc.
>
>  In general the fundamental reason that Perl, Python, Unix, Apache etc
>  documentations are extremely bad in multiple aspects is because of
>  OpenSource fanaticism. The fanaticism has made it that OpenSource
>  people simply became UNABLE to discern quality. This situation can be
>  seen in the responses of criticisms of OpenSource docs. What made the
>  situation worse is the OpenSource's mantra of "contribution" — holding
>  hostility any negative criticism unless the critic "contributed"
>  without charge.
>
>  Another important point is that the OpenSourcers tend to attribute
>  "lack of resources" as a excuse for their lack of quality. (when they
>  are kicked hard to finally admit that they do lack quality in the first
>  place) No, it is not lack of resources that made the OpenSource doc
>  criminally incompetent. OpenSource has created tools that take far more
>  energy and time than writing manuals. Lack of resource of course CAN be
>  a contributing reason, along with OpenSource coder's general lack of
>  ability to write well, among other reasons, but the main cause as i
>  have stated above, is OpenSource fanaticism. It is that which have made
>  them blind.
>
>  PS just to note, that my use of OpenSource here does not include Free
>  Software Foundation's Gnu's Not Unix project. GNU project in general
>  has reasonably good documentation. GNU docs are geeky in comparison to
>  the commercial entity's docs, but do not exhibit jargon abuse,
>  rambling, author masturbation, or hodgepodge as do the OpenSource ones
>  mentioned above.
>
>    ☄
>
>    ☄
>
>    ☄
>
>

#95 From: xah lee <xah@...>
Date: Tue Aug 30, 2005 2:21 am
Subject: Why OpenSource Documentation is of Low Quality
 
Why OpenSource Documentation is of Low Quality

Xah Lee, 200508

Previously i've made serious criticisms of the problems in Python's
documentation. (see http://xahlee.org/perl-python/re-write_notes.html )

I have indicated that an exemplary documentation is that of Wolfram
Research Incorporated's Mathematica language. (available online at
http://documents.wolfram.com/mathematica/ )

Since Mathematica is a proprietary language costing over a thousand
dollars and most people in the IT industry are not familiar with it, i
would like to announce a new discovery:

this week i happened to read the documentation of Microsoft's
JavaScript. See
http://msdn.microsoft.com/library/en-us/script56/html/js56jsconjscriptfundamentals.asp

This entire documentation is a paragon of technical writing. It has
clarity, conciseness, and precision. It does not abuse jargon, it
doesn't ramble, it doesn't exhibit author masturbation, and it covers
its area extremely well and completely. The documentation set is very
well organized into 3 sections: Fundamentals, Advanced, Reference. The
tutorial section “fundamentals” is extremely simple and to the point.
The “advanced” section gives a very concise yet easy to read account of
some fine details of the language. And its language reference section
is complete and exact.

This is not the only good documentation in the industry. As i have
indicated, Mathematica documentation is equally excellent. In fact, the
official Java documentation (so-called Java API by Sun Microsystems
http://java.sun.com/j2se/1.4.2/docs/api/index.html ) is also extremely
well-written, even though Java the language is unnecessarily very
complex and involves far more technical concepts that necessitate the
use of proper jargon, as can be seen in their doc.

In general the fundamental reason that Perl, Python, Unix, Apache etc
documentations are extremely bad in multiple aspects is OpenSource
fanaticism. The fanaticism has made OpenSource people simply UNABLE to
discern quality. This situation can be seen in the responses to
criticisms of OpenSource docs. What makes the situation worse is
OpenSource's mantra of “contribution”: holding in hostility any
negative criticism unless the critic “contributed” without charge.

Another important point is that the OpenSourcers tend to cite
“lack of resources” as an excuse for their lack of quality. (when they
are kicked hard to finally admit that they do lack quality in the first
place) No, it is not lack of resources that made the OpenSource doc
criminally incompetent. OpenSource has created tools that take far more
energy and time than writing manuals. Lack of resources of course CAN be
a contributing reason, along with OpenSource coders' general lack of
ability to write well, among other reasons, but the main cause, as i
have stated above, is OpenSource fanaticism. It is that which has made
them blind.

PS just to note, my use of OpenSource here does not include the Free
Software Foundation's Gnu's Not Unix project. The GNU project in general
has reasonably good documentation. GNU docs are geeky in comparison to
the commercial entities' docs, but do not exhibit jargon abuse,
rambling, author masturbation, or hodgepodge as do the OpenSource ones
mentioned above.

   ☄

   ☄

   ☄

#94 From: "Xah Lee" <xah@...>
Date: Tue Jun 21, 2005 10:39 am
Subject: tree functions exercise: Table
 
The Perl solution is posted. It is a bit long. See
http://xahlee.org/tree/Table.html

the code is pasted below. It uses several auxiliary functions.



#! perl

# http://xahlee.org/tree/tree.html
# Xah Lee, 2005-05


#_____ Function _____ _____ _____ _____

=pod

B<Function>

Function(parameterList,'expressionString') returns a function. The
function takes parameters in parameterList and has body
expressionString. parameterList is a reference to a list of strings,
each represents a parameter name. For example: ['i','j']. The return
value of Function is a reference to a function.

Example:

&{Function(['i','j'],'i + j')}(3,4); # returns 7.

=cut

sub Function ($$) {
    my @parameterList = @{$_[0]};
    my $expression = $_[1];

    my $parameterDeclarationString = '(';

    foreach my $parameterString (@parameterList) {
        my $variable = '$' . $parameterString;
        # match whole words only, so that a parameter named 'i' does not
        # also rewrite the 'i' inside a word such as 'print'
        $expression =~ s(\b\Q$parameterString\E\b)($variable)g;
        $parameterDeclarationString .= q($) . $parameterString . q(,);
    };

    chop($parameterDeclarationString);
    $parameterDeclarationString = q(my ) . $parameterDeclarationString . ')' . q(= @_;);

    return eval("sub {$parameterDeclarationString; return ($expression);}");
};

#end Function

#_____ Range _____ _____ _____ _____

push (@EXPORT, q(Range));
push (@EXPORT_OK, q(Range));

# this version is written by tilt...@... (Jay Tilton), Date: Sun, 15 May 2005 23:33:54 GM
sub Range {
     my( $a1, $b1, $dx ) =
         @_ == 1 ? (    1, $_[0], 1) :
         @_ == 2 ? ($_[0], $_[1], 1) :
         @_;
     if( $dx == 0 ) {
         warn "Range: increment cannot be zero.";
         return;
     }
     return [map $a1 + $dx * $_, 0..int( ($b1 - $a1) / $dx )];
};

#_____ UniqueString _____ _____ _____ _____

=pod

B<UniqueString>

UniqueString('aString', n) returns a reference to a list of n strings,
none of which occurs in the given string and no two of which are identical.
Each string starts with 'unik' and is followed by digits.

Example:

  UniqueString('Something about love...', 2);
  # returns ['unik23946', 'unik14135'].

=cut

# implementation note:
# Dependent functions: (none).

push (@EXPORT, q(UniqueString));
push (@EXPORT_OK, q(UniqueString));

sub UniqueString ($$) {
	 my $input = $_[0];
	 my $n = $_[1];

	 my $str = 'unik' . int(rand()*10000*$n);
	 my @result = ();
	 for (my $i = 0; $i < $n; $i++) {
	 while ($input =~ m($str)) {$str = 'unik' . int(rand()*10000*$n);}
	 $input .= $str;
	 push (@result, $str);
	 };
	 return \@result;
};

#end UniqueString


# _rangeSequence($ref_iteratorList) returns a sequence of ranges. For example,
# _rangeSequence([[-5,10,6],[3],[7,10]]) returns [[-5, 1, 7], [1, 2, 3], [7, 8, 9, 10]].
# Dependent functions: Range.
sub _rangeSequence ($) {
    my $ref_iteratorList = $_[0];

    my @result;
    foreach my $ref_iterator (@$ref_iteratorList) {push(@result, Range(@$ref_iterator))};
    return \@result;
};


#_____ Table _____ _____ _____ _____

# implementation note:
# gist: Generate a nested foreach loop, then evaluate this loop to get the
# result.

# First, get some basic info:

# @parameterList is of the form: ('i','j',...). If none exists, then a unique
# string is inserted. For example, if input is Table('expr',['i',4],[3]),
# then @parameterList is ('i','unik293');
# @iteratorList is of the form ([1,3],[2],[-2,7,3],...). It is the iterators
# without the dummy variable (if exists).
# $ref_rangeSequence is of the form [[1,...3],[1,...2],...]. It is the iterators
# expanded.

# Now, generate $stringToBeEvaluated. It has the following form (sample):

#foreach $h (0 .. scalar(@{$ref_rangeSequence->[0]}) -1 ) {
#foreach $unik5926 (0 .. scalar(@{$ref_rangeSequence->[1]}) -1 ) {
#foreach $gg (0 .. scalar(@{$ref_rangeSequence->[2]}) -1 ) {
#$resultArray[$h][$unik5926][$gg] = &{Function(\@parameterList,$exprString)}
#($ref_rangeSequence->[0]->[$h],$ref_rangeSequence->[1]->[$unik5926],$ref_rangeSequence->[2]->[$gg],);
#};};};

# Dependent functions: UniqueString, _rangeSequence, Function.

sub Table ($;@) {
my $exprString = shift(@_);
my @iteratorList = @_;

my $depth = scalar(@iteratorList);
my @parameterList = ();

# set @parameterList and @iteratorList.
foreach my $ref_iterator (@iteratorList) {
    if (scalar(@$ref_iterator) == 1) { push(@parameterList, ${UniqueString($exprString,1)}[0]);}
    else { push(@parameterList, shift(@$ref_iterator));};
};

# Now, @parameterList is of the form ('i','j',...).
# Now, @iteratorList is of the form ([1,3],[2],[-2,7,3],...).

# $ref_rangeSequence is of the form [[1,...3],[1,...2],...].
my $ref_rangeSequence = _rangeSequence(\@iteratorList);

my $stringToBeEvaluated;
# generate a declaration of all the symbols. e.g. 'my ($i,$j,...); my @resultArray';
$stringToBeEvaluated .= 'my (';
foreach my $variable (@parameterList) {$stringToBeEvaluated .= '$' . $variable . ','};
$stringToBeEvaluated .= "); \n";
$stringToBeEvaluated .= 'my @resultArray;' . "\n\n";

#generate the beginning of for loops, $depth number of times. e.g. 'for $i (1..10) {'
	 for my $i (0 .. $depth-1) {
	 $stringToBeEvaluated .= 'foreach $' . $parameterList[$i] .
	 ' (0 .. scalar(@{$ref_rangeSequence->[' . $i . ']}) -1 ) {' . qq(\n);
	 };

#generate the heart of the loop. e.g. $array[$i][$j]... = f($i,$j,...);
	 $stringToBeEvaluated .= '$resultArray';
	 foreach my $variable (@parameterList) {$stringToBeEvaluated .= '[$' . $variable . ']';};
	 $stringToBeEvaluated .= ' = &{Function(\@parameterList,$exprString)} (';
	 for (my $i=0; $i<$depth; $i++) {
	 	 $stringToBeEvaluated .= '$ref_rangeSequence->[' . $i . ']->[$' . $parameterList[$i] . '],'
	 };
	 $stringToBeEvaluated .= "); \n";

#generate the ending of loops, $depth number of times. e.g. '};'
	 $stringToBeEvaluated .= '};' x $depth . "\n\n";
	 $stringToBeEvaluated .=  'return \@resultArray;';

# debugging lines:
# print qq(\$exprString is: $exprString\n);
# print "\@parameterList is: @parameterList\n";
# foreach my $ref_iterator (@iteratorList) {print "@\$ref_iterator is: @$ref_iterator\n"};
# dumpValue($ref_rangeSequence);
	 print "\n\n-----------------\n\n$stringToBeEvaluated------------\n";

#evaluate the stringToBeEvaluated to obtain the array. Return the result.
eval($stringToBeEvaluated);
};

#end Table


###########################################
# testing

use Data::Dumper;

print Dumper( Table('"f[i,j,k]"', ['i',3], ['j',3], ['k',3]) );

print 5;

#93 From: "Xah Lee" <xah@...>
Date: Sun Jun 19, 2005 10:06 am
Subject: exercise: Table function
 
Here's the next tree functions exercise in Python, Perl, Java.

The problem is to write a function named Table.

Wolfram Research's Table function documentation is at:
http://documents.wolfram.com/mathematica/functions/Table
(this project is not affiliated with Wolfram Research Incorporated.)

The Perl version's documentation is below:
---------------------

  Table('exprString', [iMax]) generates a list of iMax copies of the value of
     eval('exprString'), and returns the reference to the list. i.e.
     [eval('exprString'),eval('exprString'),...]

     Table('exprString', ['i', iMax]) generates a list of the values by
     evaluating 'exprString' when 'i' in the string runs from 1 to iMax.

     Table('exprString', ['i', iMin, iMax]) starts with 'i' = iMin.

     Table('exprString', ['i', iMin, iMax, iStep]) uses steps iStep. If iStep
     is negative, then the roles of iMin and iMax are reversed. Inputs such as
     [1, -3 , 1] return a bad result.

     Table('exprString', ['i', iMin, iMax, iStep], ['j', jMin, jMax, jStep],
     ... ) gives an array by iterating 'i', 'j' in 'exprString'. For example,
     Table('f(i,j)', ['i',1,3], ['j',5,6]) returns [[f(1, 5), f(1, 6)], [f(2,
     5), f(2, 6)], [f(3, 5), f(3, 6)]].

     In general, Table has the form Table('expressionString', iterator1,
     iterator2, ...) where 'expressionString' is a string that will be
     evaluated by eval. Each iterator has one of the following forms: [iMax],
     ['dummyVarString',iMax], ['dummyVarString',iMin, iMax], or
     ['dummyVarString',iMin, iMax, iStep].

     If Table fails, 0 is returned. Table can fail, for example, when the
     arguments are not appropriate references or the iterator range is bad,
     such as ['i',5,1].

     Example:

      Table('q(s)' ,[3]); # returns ['s','s','s']

      Table( 'i**2' , ['i', 4]); # returns [1, 4, 9, 16]

      Table('[i,j,k]',['i',2],['j',100,200,100],['k',5,6])
      # returns [[[[1,100,5],[1,100,6]],[[1,200,5],[1,200,6]]],
      #          [[[2,100,5],[2,100,6]],[[2,200,5],[2,200,6]]]]

----------------

The first argument of the Table function in Mathematica (mma) is an
expression. Most other languages cannot have such symbolic expressions. In
Perl, a string is chosen instead as the expression, and it is evaluated
later as code. This may not be a practical choice but anyway it's just an
exercise. Each other language should choose an appropriate design for this
emulation...
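
As one possible design for Python, a callable can stand in for the
expression string, so nothing has to go through eval. The following is
only a sketch under that assumption, not the solution announced for this
exercise. The names table and expand are made up here. Unlike the Perl
interface, iterators carry no dummy variable name, and every iterator
passes its value to the callable as a positional argument.

```python
import math


def expand(iterator):
    """Turn (i_max,), (i_min, i_max) or (i_min, i_max, step) into a list of values."""
    if len(iterator) == 1:
        i_min, i_max, step = 1, iterator[0], 1
    elif len(iterator) == 2:
        i_min, i_max = iterator
        step = 1
    else:
        i_min, i_max, step = iterator
    if step == 0:
        raise ValueError("step cannot be zero")
    # number of values from i_min up to (or down to) i_max; a bad range gives none
    count = math.floor((i_max - i_min) / step) + 1
    return [i_min + k * step for k in range(max(count, 0))]


def table(func, *iterators):
    """Return the nested list of func(i, j, ...), first iterator outermost."""
    def build(args, remaining):
        if not remaining:
            return func(*args)
        return [build(args + (value,), remaining[1:])
                for value in expand(remaining[0])]
    return build((), iterators)


print(table(lambda i: i ** 2, (4,)))  # prints [1, 4, 9, 16]
```

The third example from the documentation above would be written as
table(lambda i, j, k: [i, j, k], (2,), (100, 200, 100), (5, 6)).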

Perl, Python, Java solutions will be posted by me in the coming days.

The URL for this post and all future update is at:
http://xahlee.org/tree/Table.html

  Xah
  xah@...
  http://xahlee.org/

#92 From: "Xah Lee" <xah@...>
Date: Sun Jun 12, 2005 10:12 pm
Subject: Range function
 
The belated Java solution to the previous Range exercise.



import java.util.List;
import java.util.ArrayList;
import java.lang.Math;

class math {
     public static List range(double n) {
         return range(1,n,1);
     }

     public static List range(double n, double m) {
         return range(n,m,1);
     }

     public static List range(double iMin, double iMax, double iStep) {
         List ll = new ArrayList();
         if (iMin <= iMax && iStep > 0) {
             for (int i=0; i <= Math.floor((iMax-iMin)/iStep); i++) {
                 ll.add(new Double(iMin+i*iStep));
             }
             return ll;
         }
         if (iMin >= iMax && iStep < 0) {
             for (int i=0; i <= Math.floor((iMin-iMax)/-iStep); i++) {
                 ll.add(new Double(iMin+i*iStep));
             }
             return ll;
         }
         // need to raise exception here
         return ll;
     }
}

class Range {
     public static void main(String[] arg) {
         System.out.println(math.range(5));
         System.out.println(math.range(5,10));
         System.out.println(math.range(5,7, 0.3));
         System.out.println(math.range(5,-4, -2));
     }
}

Perl & Python solutions archived at:
http://xahlee.org/tree/tree.html

  Xah
  xah@...
  http://xahlee.org/

#91 From: "Xah Lee" <xah@...>
Date: Tue May 31, 2005 10:10 am
Subject: What are OOP's Jargons and Complexities
 
The following is a new section of the article
“What are OOP's Jargons and Complexities”
at
http://xahlee.org/Periodic_dosage_dir/t2/oop.html

---------
The Rise of “Inheritance”

In well-thought-out languages, functions can have inner functions, as well as
take other functions as input and return functions as output. Here are some
examples:
subroutine generatePower(n) {
   return subroutine (x) {return x^n};
}

In the above example, the subroutine generatePower returns a function, which
takes an argument and raises it to the nth power. It can be used like this:
print generatePower(2)(5)  // prints 25
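
For comparison, the same idea can be written in real Python roughly as
follows. This is a sketch; the name generate_power simply follows the
pseudo-code above.

```python
def generate_power(n):
    """Return a function that raises its argument to the nth power."""
    def power(x):
        return x ** n  # n is remembered from the enclosing call
    return power


print(generate_power(2)(5))  # prints 25
```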

Example: fixedPoint:
subroutine fixedPoint(f,x) {
   temp=x;
   while (f(temp) != temp) {
     temp=f(temp);
   }
   return temp;
}

In the above example, fixedPoint takes two arguments f and x, where f is taken
to be a function. It applies f to x, and applies f to that result, and applies
f to that result again, and again, until the result is the same. That is to
say, it computes f[f[f[...f[x]...]]]. FixedPoint is a math notion. For example,
it can be employed to implement Newton's Method of finding solutions, as well
as many problems involving iteration or recursion. FixedPoint may have an
optional third parameter, a true/false function, as in
fixedPoint(func,arg,predicate), for determining when the nesting should stop.
In this form, it is equivalent to the “while loop” in procedural languages.
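
A Python sketch of fixedPoint with the optional stopping predicate might
look like the following. The parameter names same and max_steps are
assumptions added here: same plays the role of the predicate, and
max_steps is only a guard against iterations that never settle.

```python
import math


def fixed_point(f, x, same=lambda a, b: a == b, max_steps=1000):
    """Apply f repeatedly, starting at x, until two successive results are the same."""
    for _ in range(max_steps):
        nxt = f(x)
        if same(x, nxt):
            return nxt
        x = nxt
    raise RuntimeError("no fixed point reached within max_steps")


# Newton's method for the square root of 2: iterate x -> (x + 2/x) / 2.
root = fixed_point(lambda x: (x + 2 / x) / 2, 1.0,
                   same=lambda a, b: math.isclose(a, b, rel_tol=1e-12))
```

With floating point numbers an exact equality test may never succeed, which
is why the example passes a tolerance-based predicate.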

Example: composition:
subroutine composition(a,b,c,...) {
   return subroutine (x) {return a(b(c(...(x)...)))};
}

The above example is the math concept of function composition. That is to say,
if we apply two functions in sequence as in g[f[x]], then we can think of it
as one single function that is a composition of f and g. In math notation, it
is often denoted as (g∘f). For example, g[f[x]]→y is the same as (g∘f)[x]→y.
In our pseudo-code, the function composition takes any number of arguments,
and returns a single function of their composition.
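
In Python, a composition of any number of one-argument functions can be
sketched with functools.reduce. The name composition follows the
pseudo-code; treating composition() with no arguments as the identity
function is a choice made here, not something the text specifies.

```python
from functools import reduce


def composition(*funcs):
    """composition(a, b, c)(x) computes a(b(c(x)))."""
    def composed(x):
        # apply the rightmost function first, then work leftwards
        return reduce(lambda acc, f: f(acc), reversed(funcs), x)
    return composed


increment = lambda x: x + 1
double = lambda x: x * 2
print(composition(increment, double)(5))  # prints 11
```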

When we define a subroutine, for example:
subroutine f(n) {return n*n}

the function squares its argument, but the function is named f. Note here that
a function and its name are two different concepts. In well-thought-out
languages, the definition of a function and the naming of a function are not
made inseparable. Such languages often have a keyword “lambda” that is used to
define functions. Then, one can assign it a name if one so wishes. This
separation of concepts makes much of the linguistic power in the above
examples possible. Example:
lambda (n) {return n^2;}            // a function
(lambda (n) {return n^2;})(5)       // a function applied to 5.
f = lambda (n) {return n^2;}        // a function is defined and named
f(5)                                // a function applied to 5.
lambda (g) {return lambda (x) {return g(f(x))} }   // a function composition of (g∘f).
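
Python has a lambda keyword as well, so the lines above translate almost
directly. This is a sketch; the names square and compose are chosen here
for illustration.

```python
print((lambda n: n ** 2)(5))   # an unnamed function applied to 5; prints 25

square = lambda n: n ** 2      # the same function, now given a name
print(square(5))               # prints 25

# a function that takes g and f and returns their composition (g∘f)
compose = lambda g, f: lambda x: g(f(x))
print(compose(square, abs)(-3))  # prints 9
```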


The above facilities may seem exotic to an average programer, but it is in this
milieu of linguistic qualities that the object oriented paradigm arose, where
it employs facilities of inner function (method), assigning function to
variable (instantiation), function taking function as inputs (calling method
thru object), and application of function to argument (applying method to
instance variable).

The data-bundled-with-functions paradigm finds fitting application to some
problems. With the advent of such Object-Oriented practice, certain new ideas
emerged. One of great consequence is the idea of inheritance.

In OOP practice, computations are centered around data as entities of
self-contained boxed sets (objects). Thus, frequently one needs boxed sets
slightly different from those previously defined. Copying and pasting existing
code to define new boxed sets quickly becomes unmanageable (a messy set of
classes). With a powerful linguistic environment and habituation, one began to
write these new boxed-subroutines (classes) by extending old subroutines
(classes) in such a way that the new subroutine contains all variables and
subroutines of a base subroutine without any of the old code appearing in the
body of the subroutine. Here is a pseudo-code illustration:
g = subroutine extend(f) {
   new variables ...
   new inner-subroutines ...
   return a subroutine that also contains all stuff in subroutine f
}

Here, “extend” is a function that takes another function f, and returns a
new function such
that this new function contains all the boxed-set things in f, but added its
own. This new
boxed-set subroutine is given a name g.

In OOP parlance, this is the birth of inheritance. Here, g inherited from that
of f. f is called
the base class or superclass of g. g is the derived class or subclass of f.

In functional terms, the inheritance mechanism is a function E that
takes another function f as input and returns a new function g as
output, such that g contains all enclosed members of f along with the
new ones defined in E. In pure OOP languages such as Java, the function
E is exhibited as a keyword “extends”. For example, the above code would
be in Java:
class g extends f {
   new variables ...
   new inner-subroutines ...
}

Here is the same example in Python, where inheritance takes the form of
a class definition with a parameter:
class g(f):
   new variables ...
   new inner-subroutines ...
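
To make the pseudo-code above concrete, here is a minimal runnable sketch in Python 3. The class names Surface and ColoredSurface and their members are made up for illustration only; they do not come from the article.

```python
class Surface:
    """Base class: a box holding data together with the functions that use it."""

    def __init__(self, coordinates):
        self.coordinates = list(coordinates)

    def count(self):
        return len(self.coordinates)


class ColoredSurface(Surface):
    """Derived class: none of Surface's code is repeated in this body."""

    def __init__(self, coordinates, color):
        super().__init__(coordinates)  # reuse the base class initializer
        self.color = color             # a new variable

    def describe(self):                # a new inner-subroutine
        return f"{self.color} surface with {self.count()} points"


patch = ColoredSurface([(0, 0), (1, 0), (0, 1)], "gray")
print(patch.describe())
```

ColoredSurface never defines count(), yet describe() can call it, which is the "contains all stuff in f" part of the pseudo-code.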


Data is the quintessence in computation. Because in OOP all data are
embodied in classes, and wrapping a class to each and every variety of
data is unmanageable, inheritance became the central means to manage
data.
---------

  Xah
  xah@...http://xahlee.org/

#90 From: xah lee <xah@...>
Date: Thu May 26, 2005 12:59 am
Subject: Re: What are OOP's Jargons and Complexities
p0lyglut
 
The following is a new section of the article
“What are OOP's Jargons and Complexities”
at
http://xahlee.org/Periodic_dosage_dir/t2/oop.html

---------
The Rise of “Access Specifiers” (or, the Scoping Complexity of OOP)

In programing, a variable has a scope — meaning where the variable can
be seen. Normally, there are two basic models: dynamically scoped and
lexically scoped. Dynamic scoping is basically a time based system,
while lexical scoping is text based (like “what you see is what you
get”). For example, consider the following code:
subroutine f() {return y}
{y=3; print f()}

In dynamic scoping, the printed result is 3, because during evaluation
of the block every y, including the one inside f, is set to 3. In
lexical scoping, an undefined “y” is printed (or an error is raised),
because the y inside f refers to whatever y is visible where f is
written, not to the y set in the block that calls f. With regards to
language implementation, dynamic scoping is the no-brainer of the two,
and is the model used in earlier languages. Most of the time, lexical
scoping is more natural and desired.
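
Python happens to be lexically scoped, so the example can be tried directly. The following is a small Python 3 sketch; the function name block and the returned string "undefined" are choices made here for illustration.

```python
def f():
    return y          # y is looked up where f is written: at module level


def block():
    y = 3             # local to block, so it is invisible to f
    try:
        return f()
    except NameError:  # no module-level y exists, so the lookup fails
        return "undefined"


print(block())        # undefined
```

Under dynamic scoping the same call would have returned 3, because f would see the y of whoever called it.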

Scoping is also applicable to subroutines. That is to say, where
subroutines can be seen. A subroutine's scope is usually at the level
of source file (or a concept of a module/package/library), because
subroutines are often used in the top level of a source file, as
opposed to inside a code block like variables.

In general, the complexity of scoping is really just how deeply nested
a name appears. For example, see the following code:
name1;     // top level names. Usually subroutines.
{
    name2    // second level names. Usually variables inside subroutines.
    {
      name3  // deeper level names. Less often used in structured programing.
    }
}

If a programing language uses only one single file of commands in
sequence as in the early languages such as BASIC, there would be no
scoping concept. The whole program is of one single scope.

OOP has created an immense scoping complexity because its mode of
computing is calling nested subroutines (methods) inside classes
(subroutines). We detail some aspects in the following.

In OOP, variables inside subroutines (class variables) can also be
accessed thru a reference the subroutine is assigned to (that is, an
object). In OOP parlance: a variable in a class has a scope, while the
same variable when the class is instantiated (an object) is a different
scoping issue. In other words, OOP created a new entity “variable thru
reference” that comes with its own scoping issue. For example:
class a_surface() {
    coordinates={...};               // a variable
}

class main() {
    mySurface = new a_surface();
    mySurface.coordinates = {...};   // the same variable
}

In the above code, the variable “coordinates” appears in two places:
once as defined inside a_surface, and once thru an instantiated version
of a_surface, that is, an object. The variable as accessed thru the
object reference apparently has an entirely different scoping issue than
the same variable inside the subroutine (class) definition. The question
for OOP language designers is: what should the scope be for variables
referred thru objects? Within the class where the object is created?
Within the class where the variable is defined? Globally? (And what
about inherited classes? (covered later))

As we've seen, methods are just inner-subroutines, and creating objects
to call methods is OOP's paradigm. In this way, names at the second
level of program structure, usually associated with variables (and
inner-subroutines), are now brought to the forefront. This is to say,
the scoping of subroutines is raised to the same level of complexity as
the scoping of variables (both being “class members”).

All in all, the scoping complexities of OOP as applied to different OOP
entities (class variables, class's methods, object variables and
methods) are manifested as access specifiers in Java. In Java, access
specifiers are the keywords “private”, “protected”, “public”, used to
declare the scope of an entity. Together with a default scope of
no-declaration, they create 4 types of scope, and have entirely
different effects when used upon a variable, a method, a constructor,
and a class.
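
For contrast, Python has no access-specifier keywords at all; it relies on naming conventions. The sketch below (Python 3) shows the nearest equivalents. The class Account and its members are hypothetical, invented only for this illustration.

```python
class Account:
    def __init__(self, owner, balance):
        self.owner = owner          # public by convention
        self._notes = []            # single underscore: "internal use" hint only
        self.__balance = balance    # double underscore: the name gets mangled

    def balance(self):
        return self.__balance       # inside the class the short name works


acct = Account("xah", 10)
print(acct.balance())               # access thru a method works

try:
    acct.__balance                  # outside the class this name does not exist
except AttributeError:
    print("no attribute __balance outside the class")

print(acct._Account__balance)       # the mangled name is still reachable
```

So the scope question the Java keywords answer by enforcement is, in Python, answered mostly by agreement among programers.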

See this tutorial of Java's access specifiers for technicality:
http://xahlee.org/java-a-day/access_specifiers.html
---------

   Xah
   xah@...http://xahlee.org/

#89 From: xah lee <xah@...>
Date: Tue May 24, 2005 6:28 pm
Subject: Re: What are OOP's Jargons and Complexities
p0lyglut
 
The following is a new section of the article
“What are OOP's Jargons and Complexities”
at
http://xahlee.org/Periodic_dosage_dir/t2/oop.html

---------
The Rise of “Constructors” and “Accessors”

An instantiation is when a variable is assigned a super-subroutine
(class). A variable assigned such a super-subroutine is now called an
instance of a class or an object.

In OOP practice, certain inner-subroutines (methods) have developed
into specialized purposes. An inner-subroutine that is always called
when the super-subroutine is assigned to a variable (instantiation) is
called a constructor or initializer. These specialized
inner-subroutines are sometimes given a special status in the language.
For example, in the Java language, constructors are different from
methods.

In OOP, it has developed into a practice that in general the data
inside super-subroutines are supposed to be changed only by the
super-subroutine's inner-subroutines, as opposed to by reference thru
the super-subroutine. (In OOP parlance: class's variables are supposed
to be accessed/changed only by the class's methods.) This practice is
not universal or absolute, though. Inner-subroutines that change or
return the value of variables are called accessors. For example, in
Java, a string class's method length() is an accessor.

Because constructors are usually treated as a special method at the
language level, their concept and linguistic issues are an OOP machinery
complexity, while the accessor concept is an OOP engineering complexity.
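
As a rough illustration of the two terms, here is a Python 3 sketch. In Python the initializer is the specially named method __init__; the class Surface and the accessor names get_coordinates and set_coordinates are assumptions made for this example.

```python
class Surface:
    def __init__(self, coordinates):
        # constructor/initializer: runs automatically on instantiation
        self._coordinates = list(coordinates)

    def get_coordinates(self):
        # accessor that returns the value (a copy, so callers cannot
        # change the stored list by accident)
        return list(self._coordinates)

    def set_coordinates(self, coordinates):
        # accessor that changes the value
        self._coordinates = list(coordinates)


my_surface = Surface([(0, 0), (1, 1)])   # __init__ is called here
my_surface.set_coordinates([(2, 2)])
print(my_surface.get_coordinates())
```

Note that __init__ is never called by name; the language calls it, which is the "special status" mentioned above.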
------------

   Xah
   xah@...http://xahlee.org/

#88 From: xah lee <xah@...>
Date: Mon May 23, 2005 7:39 pm
Subject: What are OOP's Jargons and Complexities
p0lyglut
 
The following is a new section of the article
“What are OOP's Jargons and Complexities”
at
http://xahlee.org/Periodic_dosage_dir/t2/oop.html

---------
The Rise of “Static” versus “Instance” variables

In a normal programing language, variables inside functions are used by
the function, called local variables.

In the OOP paradigm, as we've seen, super-subroutines (classes) are
assigned to variables (instantiation), and the inner-subroutines
(methods) are called thru the variables (objects). Because of this
mechanism, what were once known as local variables (class variables) can
now also be accessed thru the assigned variable (object). In OOP
parlance, this is to say that a class's variables can be accessed thru
the object reference, such as in myObject.data=4. For example:
mySurface = new a_surface();
mySurface.coordinatesList={...} // assign initial coordinates

However, sometimes a programmer only needs a collection of variables.
For example, a list of colors:
black = "#000000";
gray = "#808080";
green = "#008000";

In pure OOP, data such as these now come with a subroutine (class) wrapper:
class listOfColors() {
    black = "#000000";
    gray = "#808080";
    green = "#008000";
}

Now to access these values, normally one needs to assign this
subroutine (class) to a variable (instantiation) so as to create an object:
myColors = new listOfColors(); // instantiation! (creating an "object")
newColor = myColors.black;

As a workaround for this cumbersomeness, the concept of “static”
variables was born (with the keyword “static” in Java). When a variable
is declared static, that variable can be accessed without needing to
instantiate its class. Example:
class listOfColors() {
    static black = "#000000";
    static gray = "#808080";
    static green = "#008000";
}
newColor = listOfColors.black;   // no instantiation required

The issue of staticality is also applicable to inner-subroutines
(methods). For example, if you are writing a collection of math
functions such as Sine, Cosine, Tangent... etc, you don't really want
to create an instance in order to use them. Example:
class mathFunctions() {
    static sin (x) {...};         // a static method
    ...
}
class main() {
    print mathFunctions.sin(1);  // no need to create object before use
}
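
In Python the same idea can be sketched with the built-in staticmethod decorator. The class name MathFunctions follows the pseudo-code above; the method sin_degrees is a made-up example (Python 3).

```python
import math


class MathFunctions:
    @staticmethod
    def sin_degrees(x):
        """Sine of an angle given in degrees; needs no object state."""
        return math.sin(math.radians(x))


# no need to create an object before use
print(MathFunctions.sin_degrees(90))
```

In practice a Python programer would often just put such functions in a module, which already serves as the wrapper.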

The non-static variants of variables and methods are called “instance
variables” or “instance methods”, or collectively “instance members”.
Note that static members and instance members are very different. With
static members, variables and methods can be used without creating an
object. But more subtly, for a static variable, there is just one copy
of the variable; for instance variables, each object maintains its own
copy of the variable. A class can declare just some variables static.
So, when multiple objects are created from the class, some variables
will share values while others have independent copies. For example:
class a_surface() {
    static pi;                     // a static variable
    coordinatesList;               // a instance variable
    ...
};
a_surface.pi=3.1415926;          // assign value of pi for all a_surface objects
mySurface1 = new a_surface();
mySurface1.coordinatesList={...} // assign coordinates to one a_surface object
mySurface2 = new a_surface();
mySurface2.coordinatesList={...} // assign coordinates to another a_surface object
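
The one-copy versus copy-per-object distinction can be observed in Python 3 with a class attribute and an instance attribute. This is only a sketch; the names Surface, s1 and s2 are illustrative.

```python
class Surface:
    pi = 3.1415926                       # class ("static") variable: one shared copy

    def __init__(self, coordinates):
        self.coordinates = coordinates   # instance variable: one copy per object


s1 = Surface([(0, 0)])
s2 = Surface([(1, 1)])

Surface.pi = 3.14159                     # change the shared copy thru the class
print(s1.pi, s2.pi)                      # both objects see the new value

s1.coordinates.append((2, 2))            # only s1's list changes
print(s1.coordinates, s2.coordinates)
```

One caveat: assigning s1.pi = ... thru an object would create a new instance variable on s1 instead of changing the shared one.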

The issue of static versus instance members is one complexity arising
out of OOP.
------------

   Xah
   xah@...http://xahlee.org/

#87 From: "Xah Lee" <xah@...>
Date: Sun May 15, 2005 4:31 pm
Subject: a Range function
p0lyglut
 
The previously posted solutions are badly botched.

Here's a better solution. Any further correction will appear on the
website instead. (http://xahlee.org/tree/tree.html)

A similar change needs to be made for the Perl code... Java code will
come tomorrow.

By the way, the code from me is not expected to be exemplary. These are
learning experiences for all, and also an intro to functional programing
for industry programers. Also, later on there will be non-trivial
problems.

# -*- coding: utf-8 -*-
# Python

# http://xahlee.org/tree/tree.html
# Xah Lee, 2005-05

import math;

def Range(iMin, iMax=None, iStep=None):
   if (iMax==None and iStep==None):
     return Range(1,iMin)
   if iStep==None:
     return Range(iMin,iMax,1)
   if iMin <= iMax and iStep > 0:
     if (isinstance(iStep,int) or isinstance(iStep,long)):
       return range( iMin, iMax+1, iStep)
     else:
       result=[]
       for i in range(int(math.floor((iMax-iMin)/iStep))+1):
         result.append( iMin+i*iStep)
       return result
   if iMin >= iMax and iStep < 0:
     if (isinstance(iStep,int) or isinstance(iStep,long)):
       return range( iMin, iMax-1, iStep)
     else:
       result=[]
       for i in range(int(math.floor((iMin-iMax)/-iStep))+1):
         result.append( iMin+i*iStep)
       return result
   # raise error about bad argument. To be added later.

# test
print Range(5,-4,-2)

# Thanks to Peter Hansen for a correction.
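
The code above is for Python 2 (print statement, long type). For readers on Python 3, here is a shorter sketch of the same idea that computes each element as iMin + i*iStep for both integer and fractional steps. Raising ValueError on bad arguments is a choice made here, not the "return 0" behavior in the Range documentation.

```python
import math


def Range(iMin, iMax=None, iStep=1):
    """Inclusive range; Range(n) counts from 1 to n."""
    if iMax is None:
        iMin, iMax = 1, iMin
    if iStep == 0:
        raise ValueError("iStep cannot be 0")
    if (iMax - iMin) * iStep < 0:
        raise ValueError("iStep points away from iMax")
    count = int(math.floor((iMax - iMin) / iStep)) + 1
    return [iMin + i * iStep for i in range(count)]


print(Range(5))          # [1, 2, 3, 4, 5]
print(Range(5, 10))      # [5, 6, 7, 8, 9, 10]
print(Range(5, -4, -2))  # [5, 3, 1, -1, -3]
```

With a fractional step the elements are still floating point numbers, so they may differ from the decimal values in the last digits.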

  Xah
  xah@...http://xahlee.org/

#86 From: "Xah Lee" <xah@...>
Date: Sun May 15, 2005 10:06 am
Subject: a Range function
p0lyglut
 
Here's the Perl solution to the Range problem.

----------
# perl

# http://xahlee.org/tree/tree.html
# Xah Lee, 2005-05

#_____ Range _____ _____ _____ _____

=pod

B<Range>

Range($iMax) generates the list [1, 2, ... , $iMax].

Range($iMin, $iMax) generates the list [$iMin, ... , $iMax].

Range($iMin, $iMax, $iStep) uses increment $iStep, with the last element
in the result being less or equal to $iMax. $iStep cannot be 0. If $iStep
is negative, then the roles of $iMin and $iMax are reversed.

If Range fails, 0 is returned.

Example:

  Range(5); # returns [1,2,3,4,5]

  Range(5,10); # returns [5,6,7,8,9,10]

  Range( 5, 7, 0.3); # returns [5, 5.3, 5.6, 5.9, 6.2, 6.5, 6.8]

  Range( 5, -4, -2); # returns [5,3,1,-1,-3]

=cut

sub Range ($;$$) {
if (scalar @_ == 1) {return _rangeFullArgsWithErrorCheck(1,$_[0],1);};
if (scalar @_ == 2) {return _rangeFullArgsWithErrorCheck($_[0],$_[1],1);};
if (scalar @_ == 3) {return _rangeFullArgsWithErrorCheck($_[0],$_[1],$_[2]);};
};

sub _rangeFullArgsWithErrorCheck ($$$) {
my ($a1, $b1, $dx) = @_;

if ($dx == 0) {print "Range: increment cannot be zero."; return 0}
elsif ($a1 == $b1) {return [$a1];}
elsif ( ((($b1 - $a1) > 0) && ($dx < 0)) || ((($b1 - $a1) < 0) && ($dx > 0)) )
  {print "Range: bad arguments. You have [$a1,$b1,$dx]"; return 0;}
elsif ((($a1 < $b1) && ($b1 < ($a1 + $dx))) || (($a1 > $b1) && ($b1 > ($a1 + $dx))))
  {return [$a1];}
else { return _rangeWithGoodArgs ($a1,$b1,$dx);};
};

sub _rangeWithGoodArgs ($$$) {
my ($a1, $b1, $dx) = @_;
my @result;

if ($a1 < $b1) {for (my $i = $a1; $i <= $b1; $i += $dx) { push (@result, $i);}; }
else {for (my $i = $a1; $i >= $b1; $i += $dx) { push (@result, $i);}; };
return \@result;
};

#end Range

##########
# test

use Data::Dumper;
print Dumper(Range(5,7,0.3));

#85 From: "Xah Lee" <xah@...>
Date: Sun May 15, 2005 10:05 am
Subject: a Range function
p0lyglut
 
Here's the Python solution to the Range problem.

----------
# -*- coding: utf-8 -*-
# Python

# http://xahlee.org/tree/tree.html
# Xah Lee, 2005-05

# implementation note: When iStep is a decimal, rounding error
# accumulates. For example, the last item returned from
# Range(0,18,0.3) is 17.7 not 18. A remedy is to turn iStep into a
# fraction and do exact arithmetics, and possibly convert the result
# back to decimal. A lesser workaround is to split the interval as to
# do multiple smaller range and join them together.

def Range(iMin, iMax=None, iStep=None):
   if (iMax==None and iStep==None):
     return Range(1,iMin)
   if iStep==None:
     return Range(iMin,iMax,1)
   if iMin <= iMax and iStep > 0:
     if (isinstance(iStep,int) or isinstance(iStep,long)):
       return range( iMin, iMax+1, iStep)  # iMax+1 so that iMax is included
     else:
       result=[]
       while iMin <= iMax:
         result.append(iMin)
         iMin += iStep
       return result

# test
print Range(0, 18, 0.3)
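
The remedy mentioned in the implementation note (turn iStep into a fraction and do exact arithmetic) can be sketched in Python 3 with the standard fractions module. The function name exact_range is made up here, and the sketch handles positive steps only.

```python
from fractions import Fraction


def exact_range(start, stop, step):
    """Inclusive range computed with exact rational arithmetic (step > 0)."""
    # Going thru str() turns 0.3 into exactly 3/10 instead of the
    # nearest binary floating point value.
    start, stop, step = (Fraction(str(v)) for v in (start, stop, step))
    count = (stop - start) // step          # exact floor of the quotient
    return [float(start + i * step) for i in range(count + 1)]


result = exact_range(0, 18, 0.3)
print(len(result), result[-1])              # 61 18.0
```

Because each element is computed from exact fractions and converted to float only at the end, no rounding error accumulates from one element to the next.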

#84 From: xah lee <xah@...>
Date: Fri May 13, 2005 12:56 am
Subject: a Range function
p0lyglut
 
Today we'll be writing a function called Range. The Perl documentation
is as follows.

Perl & Python & Java Solutions will be posted in 48 hours.


Perl-Python a-day at:
http://xahlee.org/web/perl-python/python.html

   Xah
   xah@...http://xahlee.org/

--------------------------

Range

      Range($iMax) generates the list [1, 2, ... , $iMax].

      Range($iMin, $iMax) generates the list [$iMin, ... , $iMax].

      Range($iMin, $iMax, $iStep) uses increment $iStep, with the last
      element in the result being less or equal to $iMax. $iStep cannot
      be 0. If $iStep is negative, then the role of $iMin and $iMax are
      reversed.

      If Range fails, 0 is returned.

      Example:

       Range(5); # returns [1,2,3,4,5]

       Range(5,10); # returns [5,6,7,8,9,10]

       Range( 5, 7, 0.3); # returns [5, 5.3, 5.6, 5.9, 6.2, 6.5, 6.8]

       Range( 5, -4, -2); # returns [5,3,1,-1,-3]

   ☄

#83 From: "Xah Lee" <xah@...>
Date: Mon May 9, 2005 6:55 am
Subject: Python doc problems: unnecessary use of jargons
p0lyglut
 
Here is an illustration of the Info Tech industry's need for inane
formality and spurious jargons.

The official Python doc on regex syntax
( http://python.org/doc/2.4/lib/re-syntax.html ) says:

-----
"|"

  A|B, where A and B can be arbitrary REs, creates a regular expression
that will match either A or B. An arbitrary number of REs can be
separated by the "|" in this way. This can be used inside groups (see
below) as well. As the target string is scanned, REs separated by "|"
are tried from left to right. When one pattern completely matches, that
branch is accepted. This means that once A matches, B will not be tested
further, even if it would produce a longer overall match. In other
words, the "|" operator is never greedy. To match a literal "|", use \|,
or enclose it inside a character class, as in [|].
-----

Note: “In other words, the "|" operator is never greedy.”

Note the need to inject the high-brow jargon “greedy” here as a
latched-on sentence.

“never greedy”? What is greedy anyway?

“Greedy”, when used in the context of computing, describes a certain
characteristic of algorithms. An algorithm for a minimizing/maximizing
problem is greedy when, whenever it faces a choice, it simply chooses
the option that looks best at that moment (for example, the shortest
next step), without considering whether that choice actually results in
an optimal solution.

The rub is that such a strategy will often not obtain the optimal
result. If you go from New York to San Francisco and always choose the
road most directly facing your destination, you'll never get there.
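
A small Python 3 sketch of what a greedy algorithm looks like and how it can miss the optimum. The coin-change problem and the function name greedy_change are chosen here purely as an illustration; they are not from the Python doc.

```python
def greedy_change(amount, coins):
    """Make change by always taking the largest coin that still fits."""
    picked = []
    for coin in sorted(coins, reverse=True):
        while amount >= coin:
            picked.append(coin)
            amount -= coin
    return picked


# Greedy uses three coins, although 3 + 3 would need only two.
print(greedy_change(6, [1, 3, 4]))   # [4, 1, 1]
```

At every step the algorithm faces a real choice among coins and takes the locally best one, which is exactly what the regex alternation does not do.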

For an algorithm to be greedy, it is implied that it faces choices. In
the case of alternatives in regex "regex1|regex2|regex3", there is
really no selection involved, only the following of a given sequence.

What the writer was thinking when he latched on about greediness is
that the result may not be from the pattern that matches the most
substring, therefore it is not “greedy”. It's not greedy Python docer's
ass.

Such unnecessary jargon throwing, as found everywhere in tech docs, is a
significant reason why the computing industry is filled with shams the
likes of unix, Perl, Programing Patterns, eXtreme Programing,
“Universal” Modeling Language.

Here is a version that is simpler, clearer, to the point:
The vertical bar is used to express alternatives in regex. For example,
r'regex1|regex2|regex3' will match any of the regexes, trying them from
left to right. For example, if regex2 matches at the current position in
the target string, regex3 will not be tried, even if it would also match
there and would match a longer substring than regex2. Alternatives can
be used inside capture groups as well (see Captures below).
To match the vertical bar | exactly, use \|.
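
The behavior described can be checked with a few lines of Python 3. The patterns and strings below are made up for the demonstration.

```python
import re

# Alternatives are tried left to right; the first one that matches wins,
# even when a later alternative would match a longer substring.
print(re.search(r'ab|abcd', 'abcd').group())   # ab
print(re.search(r'abcd|ab', 'abcd').group())   # abcd

# Alternatives inside a capture group.
print(re.search(r'x(a|b)y', 'xby').group(1))   # b

# A literal vertical bar.
print(re.search(r'\|', 'a|b').start())         # 1
```

Reordering the alternatives is therefore the way to make the longer one win.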

See How to Write a Tutorial
( http://xahlee.org/Periodic_dosage_dir/t2/xlali_skami_cukta.html ),
which documents and analyzes more of doc writer's inane need to use
jargons.

See also: Responsible Software Licensing:
( http://xahlee.org/UnixResource_dir/writ/responsible_license.html ).

#82 From: xah lee <xah@...>
Date: Sat May 7, 2005 5:25 am
Subject: Python doc problems (continued)
p0lyglut
 
HTML Problems in Python Doc

I don't know what kind of system is used to generate the Python docs,
but it is quite unpleasant to work with manually, as there are
egregious errors and inconsistencies.

For example, on the “Module Contents” page (
http://python.org/doc/2.4/lib/node111.html ), the closing tags for <dd>
are never used, and all the tags are in lower case. However, on the
regex syntax page ( http://python.org/doc/2.4/lib/re-syntax.html ), the
closing tags for <dd> are given, and all tags are in caps.

The doc's first lines declare a type of:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

yet in the files they use "/>" to close image tags, which is an XHTML
syntax.

The doc litters <p> and never closes them, making it illegal XML/XHTML
by breaking the minimal requirement of well-formedness.

Aside from correctness, the code is quite bloated, as is generally true
of generated HTML. For example, it is littered with: <tt id='l2h-853'
xml:id='l2h-853'> which isn't used in the style sheet, and i don't
think those ids can serve any purpose other than in style sheet.

Although the doc uses a huge style sheet and almost every tag comes
with a class or id attribute, it also profusely uses hard-coded
style tags like <b>, <big> and Netscape's <nobr>.

It also abuses tables that effectively do nothing. Here's a typical
line:
<table cellpadding="0" cellspacing="0"><tr valign="baseline">
    <td><nobr><b><tt id='l2h-851' xml:id='l2h-851'
class="function">compile</tt></b>(</nobr></td>
    <td><var>pattern</var><big>[</big><var>,
flags</var><big>]</big><var></var>)</td></tr></table>


If Python is supposed to be a quality language, then its
documentation's content and code seem to indicate otherwise.
-----------------------

This email is archived at:
http://xahlee.org/perl-python/re-write_notes.html

   Xah
   xah@...http://xahlee.org/

#81 From: xah lee <xah@...>
Date: Fri May 6, 2005 10:18 am
Subject: Notes on rewriting the Python documentation
p0lyglut
 
Notes on rewriting the Python documentation

Xah Lee, 200505.

In 2005, i started to learn Python by reading its official
documentation. In the process, i found the doc's quality really bad. I
have written several essays regarding its problems, collected as “How
to write a tutorial” at:
http://xahlee.org/Periodic_dosage_dir/t2/xlali_skami_cukta.html.
Subsequently, i've undertaken the task of completely rewriting the doc
for Python's RE module. See
http://xahlee.org/python_re-write/lib/module-re.html. What follows
below are some notes on this rewrite experience.

Remove Command Line Interface Look & Feel

In the doc, examples are often given in Python command line interface
format. For example:
  >>> def f(n):
...     return n+1
...
  >>> f(1)
2

instead of:
def f(n):
    return n+1
print f(1)   # returns 2

The clean format should be used because it does not require familiarity
with the Python command line, it is more readable, and the code can be
copied and run readily.

A significant portion of Python doc's readers, if not the majority,
didn't come to Python as beginning programers, and one way or another
never used or cared about the Python command line interface.

Suppose a non-Python programer is casually shown a page of Python doc.
She will get much more from the clean example than the version
cluttered with Python Command line interface irrelevancies.

Suppose now we have an experienced professional Python programer. She
will also find examples in plain code more immediately readable and
familiar than the version plastered with Python Command line interface
irrelevancies.

The only place where the Python command line look-and-feel is
appropriate is in the Python tutorial, and arguably only in the
beginning sections.

Extra point: If the Python command line interface were actually a robust
application, like a so-called IDE, e.g. the Mathematica front-end, then
things would be very different. In reality, the Python command line
interface is a toy whose max use is as the simplest calculator, doubling
as a chanting novelty for ignorant coders. In practice it isn't even
suitable as a trial'n'error pad for real-world programing.

Extra point: do not use the jargon “interpreter”. 90% of its use in the
doc should be deleted. They should be replaced with “software”,
“program”, “command line interface”, or “language” or others.

(I dare say that 50% of all uses of the word interpreter in computer
language contexts are inane. Fathering large amounts of
misunderstanding and confusion.)

Move Irrelevant Histories to one place

Notes on the history of Python are littered all over the doc. e.g.
“Incompatibility note: in the original Python 1.5 release, maxsplit was
ignored. This has been fixed in later releases.”

99% of programers are not concerned with the history of a language.
Inevitably software, including languages, changes over time, however
conservative one tries to be. So, move all these changes into a “New
and Incompatible changes” page at some appendix of the lang spec. This
way, people who are maintaining older code can find their info in one
coherent place, while general programers are not forced to wade thru
the details of the past every few paragraphs. (A few exceptions can be
made: when the change is of such major importance that all practicing
Python coders really must be informed, regardless of whether they
maintain old code.)

Organize by Functionality, not Implementation

Do not take an attitude like you have to stick to some artificial format
or “correctness” in the doc. Remember, the doc's prime goal is to
communicate to programers how a language functions, not how it is
implemented or how it is described in technical or computer-science
terms.

In writing a language documentation, there is a question of how to
organize it. This is an issue of design, and it takes thinking.

When a doc writer is faced with such a challenge, the easiest route is
a no-brainer: following the way the language is implemented. For
example, the doc will start with the language's “data types”. This
no-brainer stupidity is unfortunately how most language docs are
organized, and the Python doc is one of the worst.

One can see this phenomenon in the official doc of Python's RE module.
For example, it begins with Regex Syntax, then it follows with “Module
contents”, then Regex Objects, then Match Objects. And in each page,
the functions or methods are arranged in alphabetical order. This is
typical of the no-brainer organization following how the module is
implemented or a certain artificial arrangement. It has only a remote
connection to how the module is used to perform a task.

In general, language docs should be organized by the tasks the language
is supposed to accomplish, and or by functionalities. In other words,
from the point of view of the language's users (i.e. programers), not
the language's compiler.

For example, organize the RE module doc by the purposes of the module.
To begin, we explain at the outset that this module is for the purpose
of searching or replacing a string by a pattern. Then, we organize with
purpose and functionalities as guide.

Since Python provides a set of functions and an Object-Oriented set, we
create a page for each, with a clear indication of how they relate to
the string pattern search/replace task. Since Python returns the result
as a special Object, we again create a section MatchObject and clearly
tell the reader what that page is about in relation to the task. And, we
also put the regex syntax on its own page, but again make it clear what
this page means in relation to the task. And in each page, we again
organize them by the guide of tasks and functionalities. In this way,
the whole RE module doc is oriented to programing, not to how this
module happens to be classified according to some idiosyncrasies of the
Python language.

The complete rewritten doc is here:
http://xahlee.org/perl-python/python_re-write/lib/module-re.html

   Xah Lee

   ☄

#80 From: Xah Lee <xah@...>
Date: Tue Apr 19, 2005 5:57 pm
Subject: new documentation of Python's RE Module
p0lyglut
 
long story short, here's a complete rewrite of Python's re module doc

http://xahlee.org/perl-python/python_re-write/lib/module-re.html

This documentation should be far more readable than the original. It
took me about 8 hours.

long story short... i thought unix/perl is bad and Python's is different
because Pythoners didn't start with a moronic and sloppy attitude. To my
chagrin, the Python docs are quite lousy, or literally: not much better.
See also:

How to Write a Computer Tutorial
http://xahlee.org/Periodic_dosage_dir/t2/xlali_skami_cukta.html

which documents some of my experiences with the Python doc.

Xah
xah@...http://xahlee.org/

#79 From: xah lee <xah@...>
Date: Sun Apr 17, 2005 8:17 am
Subject: Re: split a line by regex
p0lyglut
 
Unicode chars can be included in regex patterns directly. Just make
sure your string starts with ur. For example:
re.search(ur'苦',mystring,re.U). Unicode can also be represented by “\u”
followed by its hexadecimal code. For example, to match the unicode “em
space”, which has hexadecimal 2003, do
re.search(ur'\u2003',mystring,re.U). (em space can also be included
literally as well)
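
The ur prefix and the unicode type above are Python 2 features. As a sketch for readers on Python 3, where every str is already Unicode and the ur prefix no longer exists, the same searches look like this (the sample string is made up):

```python
import re

mystring = "苦澀\u2003等待"      # two characters, an em space, two characters

print(re.search('苦', mystring).start())       # 0
print(re.search('\u2003', mystring).start())   # 2

# For str patterns, \s matches Unicode whitespace such as the em space
# without needing the re.U flag.
print(re.split(r'\s+', mystring))
```

The re.U flag is the default for str patterns in Python 3, so it can simply be left out.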


   Xah
   xah@...http://xahlee.org/

-------
On Apr 16, 2005, at 4:28 AM, Xah Lee wrote:



20050415 split a line by regex

Here's another example of using regex.

I have a file that is a translation of Chinese lyrics. It is formatted
like this:

你是我最苦澀的等待   |   you are my hardest wait
讓我歡喜又害怕未來   |   giving me joy and also fear the future

Where the left side is Chinese, the right side English. I want to write
a program to split the line, so that i get the whole Chinese part or the
whole English part.

Here's the code:

# -*- coding: utf-8 -*-
# Python

import re

filePath='/Users/t/web/Periodic_dosage_dir/sanga_pemci/x.txt'

inF = open(filePath,'rb')
s=unicode(inF.read(),'utf-8')
inF.close()

lns=re.split(r'\n',s)

for ln in lns:
      # re.U must go to re.compile(); the third argument of re.split() is maxsplit
      fracture=re.compile(r'\s*\|\s*',re.U).split(ln)
      print fracture[0].encode('utf-8')


------------

The official Python doc for regex is here, very lousy in quality:
http://python.org/doc/2.4.1/lib/module-re.html


http://xahlee.org/perl-python/regex_split.html

#78 From: Xah Lee <xah@...>
Date: Sat Apr 16, 2005 11:28 am
Subject: split a line by regex
p0lyglut
 
20050415 split a line by regex

Here's another example of using regex.

I have a file that is a translation of Chinese lyrics. It is formatted
like this:

你是我最苦澀的等待   |   you are my hardest wait
讓我歡喜又害怕未來   |   giving me joy and also fear the future

Where the left side is Chinese, the right side English. I want to write
a program to split the line, so that i get the whole Chinese part or the
whole English part.

Here's the code:

# -*- coding: utf-8 -*-
# Python

import re

filePath='/Users/t/web/Periodic_dosage_dir/sanga_pemci/x.txt'

inF = open(filePath,'rb')
s=unicode(inF.read(),'utf-8')
inF.close()

lns=re.split(r'\n',s)

for ln in lns:
     # re.U must go to re.compile(); the third argument of re.split() is maxsplit
     fracture=re.compile(r'\s*\|\s*',re.U).split(ln)
     print fracture[0].encode('utf-8')
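
For readers on Python 3, here is a sketch of the same split that works on an in-memory string instead of a file, so it can be run as is. The function name split_pairs and the variable names are made up for this example.

```python
import re

text = (
    "你是我最苦澀的等待   |   you are my hardest wait\n"
    "讓我歡喜又害怕未來   |   giving me joy and also fear the future\n"
)


def split_pairs(text):
    """Return a list of (chinese, english) pairs, one per non-empty line."""
    pairs = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # maxsplit=1 keeps any later "|" inside the English part
        chinese, english = re.split(r'\s*\|\s*', line, maxsplit=1)
        pairs.append((chinese, english))
    return pairs


for chinese, english in split_pairs(text):
    print(chinese)
```

Printing english instead of chinese gives the other column.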


------------

The official Python doc for regex is here, very lousy in quality:
http://python.org/doc/2.4.1/lib/module-re.html


http://xahlee.org/perl-python/regex_split.html

#77 From: xah lee <xah@...>
Date: Tue Apr 12, 2005 10:06 am
Subject: find & replace a sequence of regex pairs
p0lyglut
 
i've updated the previous find & replace code.

in the following page
http://xahlee.org/perl-python/findreplace_regex.html

is code that does find & replace of strings for all html files in a
dir, by a list of pairs of regex string patterns.

   Xah Lee


   ☄

#76 From: "Sean Gugler" <sgugler@...>
Date: Wed Mar 30, 2005 8:05 pm
Subject: RE: Re: 20050322 sorting a matrix
drguzler
 
The 'reversed' function was introduced with Python 2.4.
Older versions may instead use:
      directives.reverse()
      for (column, stringQ, directionQ) in directives:


The sorting logic is best illustrated with an example.
Given a matrix
         ['a', 99]
         ['b', 77]
         ['a', 77]
and two sorting directives [1,0,0] [0,0,0] ...

if we sort by column 1:
         ['b', 77]
         ['a', 77]
         ['a', 99]
and then sort the entire matrix again by column 0:
         ['a', 77]
         ['a', 99]
         ['b', 77]

we get different results than if we sorted first by column 0:
         ['a', 99]
         ['a', 77]
         ['b', 77]
and then by column 1:
         ['a', 77]
         ['b', 77]
         ['a', 99]

The latter case, processing the directives in reverse order,
produces the desired result of "sorted by column 1 and sub-sorted
by column 0 where column 1 is the same".
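
The same walk-through as a small Python 3 sketch. The helper name sort_by_columns is made up for illustration, and it handles ascending order only. It relies on list.sort being stable, which is what makes the repeated passes work.

```python
# The matrix from the example above.
matrix = [["a", 99], ["b", 77], ["a", 77]]


def sort_by_columns(rows, columns):
    """Sort by columns[0], sub-sorted by columns[1], and so on (ascending)."""
    result = list(rows)  # work on a copy, leave the input alone
    # Sort by the least significant column first. Each later pass is
    # stable, so it keeps the earlier ordering among equal keys.
    for col in reversed(columns):
        result.sort(key=lambda row: row[col])
    return result


print(sort_by_columns(matrix, [1, 0]))
# prints [['a', 77], ['b', 77], ['a', 99]]
```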

- Sean



#75 From: xah lee <xah@...>
Date: Wed Mar 30, 2005 7:23 am
Subject: Re: Re: 20050322 sorting a matrix
p0lyglut
 
I get:
    NameError: global name 'reversed' is not defined

> +  By iterating the directives in reversed order, we ensure
>    that directive[0] is the most significant.  The influence
>    of directive[1] will be seen only where [0] compares
>    identically, and likewise for subsequent directives.

? I don't quite follow the logic...

   Xah


   ☄

#74 From: "Sean Gugler" <sgugler@...>
Date: Mon Mar 28, 2005 7:14 pm
Subject: RE: Re: 20050322 sorting a matrix
drguzler
 
According to the Python documentation* for 'sort', supplying
a function for generating keys generally runs faster than supplying
a function for comparing elements.  Thus, we may gain some
performance for large matrices by writing the function as:

def sort_matrix (matrix, directives):
     for (column, stringQ, directionQ) in reversed(directives):
         if stringQ:
             def keyfunc (elt): return str(elt[column])
         else:
             def keyfunc (elt): return     elt[column]
         matrix.sort (None, keyfunc, not directionQ)
     return matrix

Other differences worth mentioning:

+  I find it more readable to assign names to tuples than
    to index them with [n] notation.

+  By iterating the directives in reversed order, we ensure
    that directive[0] is the most significant.  The influence
    of directive[1] will be seen only where [0] compares
    identically, and likewise for subsequent directives.
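
For anyone reading this on Python 3: list.sort there no longer takes a comparison function, and key and reverse are keyword-only. A sketch of the same function under that assumption follows. The names string_q and ascending_q are renamings for illustration; the sample matrix and directives are the ones from the 20050322 exercise.

```python
def sort_matrix(matrix, directives):
    """Sort matrix in place according to directives and return it."""
    # Apply the directives last to first. Each pass is a stable sort,
    # so directives[0], applied last, ends up most significant.
    for column, string_q, ascending_q in reversed(directives):
        if string_q:
            def keyfunc(row):
                return str(row[column])
        else:
            def keyfunc(row):
                return row[column]
        matrix.sort(key=keyfunc, reverse=not ascending_q)
    return matrix


m = [[3, 99, 'a'], [2, 77, 'a'], [1, 77, 'a']]
print(sort_matrix(m, [[2, True, True], [1, False, True]]))
# prints [[2, 77, 'a'], [1, 77, 'a'], [3, 99, 'a']]
```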

- Sean

* _Python Library Reference_, "2.3.6.4 Mutable Sequence Types", note (8)



#73 From: "Xah Lee" <xah@...>
Date: Mon Mar 28, 2005 4:20 am
Subject: Re: 20050322 sorting a matrix
p0lyglut
 
Here's the solution to the previous post.

-------------------------------
perl code:

sub sort_matrix($$) {
     my $ref_matrix = $_[0];
     my @indexMatrix = @{$_[1]};

     my @indexes = map {$_->[0]} @indexMatrix;
     my @operators = map {$_->[1] ? ' cmp ' : ' <=> '} @indexMatrix;
     my @directions = map {$_->[2]} @indexMatrix;

     my $body_code = '';
     my @body_array;
     for (my $i = 0; $i <= $#indexes; $i++) {
         if ($directions[$i]) {
             push(@body_array,
                  "(\$a->[$i]" . $operators[$i] . "\$b->[$i])");
         } else {
             push(@body_array,
                  "(\$b->[$i]" . $operators[$i] . "\$a->[$i])");
         };
     };
     $body_code = join( ' or ', @body_array);

     my $array_code = '(map { [' . join(q(, ), map {"\$_->[$_]"} @indexes)
                    . ', $_]} @$ref_matrix)';

     my $code = "map {\$_->[-1]} (sort { $body_code} $array_code)";
     my @result = eval $code;
     return [@result];
};

------------------------------------------
Python code

# python v 2.4

def sort_matrix(matrix, directives):
     # sorts the matrix in place and returns it.
     # note: the directives are applied first to last, and each sort is
     # stable, so the LAST directive ends up as the most significant key.
     result=matrix
     for dir in directives:
         if dir[1]:
             if dir[2]:
                 result.sort(lambda x,y: cmp( str(x[dir[0]]),
                                              str(y[dir[0]])) )
             else:
                 result.sort(lambda x,y: cmp( str(x[dir[0]]),
                                              str(y[dir[0]])), None, True)
         else:
             if dir[2]:
                 result.sort(lambda x,y: cmp(float(x[dir[0]]),
                                             float(y[dir[0]])) )
             else:
                 result.sort(lambda x,y: cmp(float(x[dir[0]]),
                                             float(y[dir[0]])), None, True )
     return result

m = [
    [3, 99, 'a'],
    [2, 77, 'a'],
    [1, 77, 'a']
  ]

print sort_matrix(m,[
[2,True,True],
[1,False,True]
])

The Python code has not been tested much.
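
The Perl version above builds one comparison that chains the per-column comparisons with "or" and sorts once. A rough Python 3 sketch of that single-pass idea follows. It assumes functools.cmp_to_key as a stand-in for the removed cmp argument; the helper cmp and the name sort_matrix_single_pass are made up for illustration.

```python
from functools import cmp_to_key


def cmp(a, b):
    """Three-way comparison; Python 3 has no built-in cmp."""
    return (a > b) - (a < b)


def sort_matrix_single_pass(matrix, directives):
    """Return a new sorted list; directives[0] is the most significant."""
    def compare(x, y):
        # Walk the directives in order; the first column that differs
        # decides the result, later columns only break ties.
        for column, string_q, ascending_q in directives:
            convert = str if string_q else float
            result = cmp(convert(x[column]), convert(y[column]))
            if result:
                return result if ascending_q else -result
        return 0
    return sorted(matrix, key=cmp_to_key(compare))


m = [[3, 99, 'a'], [2, 77, 'a'], [1, 77, 'a']]
print(sort_matrix_single_pass(m, [[2, True, True], [1, False, True]]))
# prints [[2, 77, 'a'], [1, 77, 'a'], [3, 99, 'a']]
```

Unlike the repeated-sort version, this one reads the directives in their natural order, so no reversal is needed.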

http://xahlee.org/perl-python/sort_matrix.html


#72 From: xah lee <xah@...>
Date: Fri Mar 25, 2005 4:35 am
Subject: a critique of Python documentation
p0lyglut
 
The Python doc is relatively lousy, from content organization to the
tech writing quality.

I think i'll just post snippets of my comments as i find them. (and
feel like exposing)

Python doc “2.3.3 Comparisons” at
   http://python.org/doc/2.4/lib/comparisons.html , quote:
   Comparison operations are supported by all objects. They all have the
same priority (which is higher than that of the Boolean operations).
Comparisons can be chained arbitrarily; for example, x < y <= z is
equivalent to x < y and y <= z, except that y is evaluated only once
(but in both cases z is not evaluated at all when x < y is found to be
false).

• Problem: “Comparison operations are supported by all objects.”

This is very vague and ambiguous.

The word “object” has a generic English meaning and may also have a
specific technical meaning in a language that supports Object-Oriented
programming. In Python, it does not have a very pronounced technical
meaning. For example, there's a chapter in Python Library Ref titled
“2. Built-In Objects”, and under it a section “2.1 Built-in Functions”.
Apparently, functions can't possibly be meant as an “object” for
comparisons.

Now suppose we take the object in the sentence to be sensible items such
as numbers, lists etc. The clause “supported by all objects” is ambiguous.
What is meant by “supported”?

• Problem: They all have the same priority (which is higher than that
of the Boolean operations).

This sentence is very stupid, in a multitude of aspects.

The “priority” referred to here means operator precedence.

It tries to say that the comparison operator has higher syntactical
connectivity than boolean operators. E.g. “False and False==False”
means “False and (False==False)” and not “(False and False)==False”.
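
A quick check of that grouping in the interpreter (a small sketch; the variable names are only for illustration):

```python
# Comparison operators bind more tightly than `and`, so the
# unparenthesized form groups like the first parenthesized form.
plain = False and False == False
comparison_first = False and (False == False)
and_first = (False and False) == False

print(plain, comparison_first, and_first)
# prints False False True
```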

However, the pronoun “they”, in the context of the previous sentence,
refers to “the comparison operations”, not “operators”. So it leads
the reader to think about some “operation precedence”, which in itself
cannot be ruled out as nonsense depending on the context. Very stupid,
confusing writing.

And, from a pure writing aspect, the clause “...(which is ...)” is some
kind of juvenile latch-on. If the author intends to make that point,
say it in its own sentence, e.g. “The comparison operators have higher
precedence than boolean operators.” It would be better not to mention
this at all. For practical considerations, in the rare case of mixing
boolean and comparison operators, parentheses are likely to be used,
which is indeed good practice. The proper place for operator precedence
is a table listing all the operators, giving a clear view, in the
appendix of the language spec.

• Problem: Comparisons can be chained arbitrarily; for example, x < y
<= z is equivalent to x < y and y <= z, except that y is evaluated only
once (but in both cases z is not evaluated at all when x < y is found
to be false).

Drop the word “arbitrarily”. It has no meaning here.

The whole sentence is one verbiage of pell-mell thinking and writing.
Here's one better version:

Comparisons can be chained, and are evaluated from left to right. For
example, x < y <= z is equivalent to (x < y) and (y <= z), with y
evaluated only once.

With respect to documentation style, it is questionable whether this
aspect needs to be mentioned at all. In practice, if programmers need to
chain comparisons, they will readily do so. This is not out of the
ordinary in imperative languages, and evaluation from left to right is
also not extraordinary enough to merit a mention.
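
For reference, what the quoted doc sentence describes can be checked directly. A small sketch follows; the function y is a made-up stand-in for the middle operand so that its evaluations can be counted. Note that the chained form is not the same as grouping it as (x < y) <= z.

```python
calls = []


def y():
    """Stand-in for the middle operand; records each evaluation."""
    calls.append("y")
    return 3


x, z = 1, 2

chained = x < y() <= z            # behaves like (1 < 3) and (3 <= 2)
evaluations_chained = len(calls)  # y() ran only once

grouped = (x < 3) <= z            # True <= 2, that is 1 <= 2

print(chained, grouped, evaluations_chained)
# prints False True 1
```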

• Problem: <> and != are alternate spellings for the same operator. !=
is the preferred spelling; <> is obsolescent

Bad choice of term “spellings”. Computer language operators are not
known as “spellings”.

Better: “<>” is equivalent to “!=”.

If “<>” is not likely to go away in future versions, don't even mention
“preference”, because it serves no effective purpose. (If one
wants to wax philosophical about “programming esthetics”, go nag about it
outside of language documentation.)

In general, when something is obsolete or might go defunct in the
future, consider not even mentioning that construct. If necessary, add
it in an obscure place, such as in an appendix of obsolete items, and not
adjacent to critical info. In many places of the Python documentation,
this is breached.

This is just a quick partial analysis of one episode of incompetence i
see in Python docs in the past months i've had the pleasure to scan
here and there. An extreme pain in the ass.

I'm in fact somewhat surprised by this poor quality in writing. The
more egregious error is the hardware-oriented organization, aka
technical drivel. But that i accept as common in imperative language
communities and in general the computing industry. But the poor quality
in the effectiveness and clarity of the writing itself surprised me. As
exhibited above, the writing is typical of programmers, filled with
latch-on sentences and unclear thinking. (They in general don't have a
clear picture of what they are talking about, and in cases they do,
they don't have the skills to express it effectively. (Just as a
footnote: this writing problem isn't entirely the fault of programmers
or Python doc writers. In part the English language (or in general
natural languages) is to blame, because natural languages are exceptionally
convoluted and really take years to master as an art in themselves.))

Though the Python doc is relatively incompetent, we can see that the
authors did care about quality. This is in contrast to the documentation
of unix related things (unix tools, perl, apache, and so on), where the
writers have absolutely no sense of clear writing, and in most cases
don't give a damn and delight in drivel thinking of it as literary
artistry. A criminal of this sort that does society huge damage is
Larry Wall and the likes of his cohorts in the unix community.
(disclaimer: this is a piece of opinion.)

Addendum: quality writing takes time. Though, the critical part lies
not in the mastery of writing itself, but in clarity of thinking about
what exactly one wants to say. So, next time you are writing a tech
doc, first try to have a precise understanding of the object, and then
know exactly what it is that you want to say about it; then the writing
will come out vastly better. If the precise understanding of the object
is not readily at hand (which is natural and common and no need to
fidget about), being aware of it helps greatly in its exposition.

This and past critiques of Python documentation and IT docs in general
are archived at
http://xahlee.org/Periodic_dosage_dir/t2/xlali_skami_cukta.html

   Xah Lee

☄

#71 From: xah lee <xah@...>
Date: Tue Mar 22, 2005 5:03 pm
Subject: 20050322 sorting a matrix
p0lyglut
 
Today we'll write a program that can sort a matrix in all possible
ways.

Here's the Perl documentation. I'll post a Perl and Python version in 2
days.

-----------

sort_matrix( $matrix, [[$n1, $stringQ, $directionQ], [$n2, $stringQ,
$directionQ], ...]) sorts a matrix by $n1 th column then $n2 th...and
so on.

$matrix must be a reference to references of arrays, having the form
[[$e1, $e2,...], [...], ...].  $stringQ is a boolean indicating
whether to treat the corresponding column as strings instead of as
numbers in the sorting process. True means string. $directionQ is a
boolean indicating ascending sort or not for the corresponding
column. In the column spec $n1 $n2 ..., index counting starts at 0.

   Example:

   my $ref_matrix =
   [
     [3, 99, 'a'],
     [2, 77, 'a'],
     [1, 77, 'a']
   ];

sort_matrix( $ref_matrix,  [ [2,1,1], [1,0,1] ]);
# this means sort by third column, regarding it as strings,
# and in ascending order. If tie, sort by second column,
# regarding it as number, in ascending order.

# returns [[2,77,'a'],[1,77,'a'],[3,99,'a']];

------------------

Note: in the above, ignore the "must be a reference to references of
arrays". That's a technical point, because Perl the language does nested
lists thru the workaround of "references".

http://xahlee.org/perl-python/sort_matrix.html

#70 From: "Xah Lee" <xah@...>
Date: Sun Mar 20, 2005 11:32 pm
Subject: exercise: program to delete duplicate files
p0lyglut
 
Sorry i've been busy...

Here's the Perl code. I have yet to clean up the code and make it
compatible with the cleaned spec above. The code as it is performs the
same algorithm as the spec; it just doesn't print the output in that
form. In a few days, i'll post a clean version, and also a Python
version, as well as a sample directory for testing purposes. (The Perl
code has gone thru much testing and is considered correct.)

The Perl code comes in 3 files as it is:

Combo114.pm
Genpair114.pm
del_dup.pl

The main program is del_dup.pl. Run it on the command line as described
in the spec. If you want to actually delete the dup files, uncomment the
"unlink" line at the bottom. Note: the module names don't have any
significance.

Note: there are also these Python files, ready to go for the final Python
version. Possibly the final program should be just a single file...

Combo114.py
Genpair114.py

Here're the files: del_dup.zip

------
to get the code and full detail with latest update, please see:
http://xahlee.org/perl-python/delete_dup_files.html

  Xah
  xah@...
  http://xahlee.org/PageTwo_dir/more.html


-----------------------------------------------------
On Mar 9, 2005, at 4:57 AM, xah lee wrote:
here's a large exercise that uses what we built before.

suppose you have tens of thousands of files in various directories.
Some of these files are identical, but you don't know which ones are
identical with which. Write a program that prints out which files are
redundant copies.

Here's the spec...
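
The spec itself is elided in this quote. As a rough illustration of the exercise's goal only, here is a Python 3 sketch that groups files by size and then by content hash. This is a swapped-in technique, not taken from del_dup.pl or the spec; the function names and the choice of SHA-256 are assumptions.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return a sorted list of groups (sorted path lists) of identical files."""
    # Files of different sizes cannot be identical, so group by size first
    # and hash only the files that share a size with another file.
    by_size = defaultdict(list)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        by_digest = defaultdict(list)
        for path in paths:
            by_digest[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_digest.values() if len(g) > 1)
    return sorted(groups)
```

This only reports the groups. Deciding which copy in a group to keep, and actually deleting anything, is left out on purpose.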

#69 From: "Sean Gugler" <sgugler@...>
Date: Thu Mar 17, 2005 7:42 pm
Subject: RE: unicode study: print a range of unicode chars
drguzler
 
-----Original Message-----
l=[]
for i in range(0x0000, 0x0fff):
     l.append(eval('u"\\u%04x"' % i))

for x in l:
-----Original Message-----

can be simplified to:

for x in map (unichr, xrange(0x0000, 0x0fff)):
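
On Python 3, where unichr and xrange no longer exist, roughly the same thing can be sketched with chr and range (the name chars and the short printing loop are only for illustration):

```python
# Python 3: chr replaces unichr, range replaces xrange.
# As in the loop above, the upper bound 0x0fff itself is excluded.
chars = [chr(i) for i in range(0x0000, 0x0fff)]

# Print a few of them as a spot check.
for x in chars[0x41:0x44]:
    print(x)
```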


- Sean


Copyright 2009 Yahoo! Inc. All rights reserved.