MoonPoint Support Weblog

Perl Sleep Function

Perl's sleep function causes a script to wait, i.e., "sleep", for a specified number of seconds before going on to the next statement in the script.

Syntax

sleep EXPR

Definition and Usage

Causes the script to sleep for EXPR seconds, or forever if no EXPR. May be interrupted if the process receives a signal such as "SIGALRM". Returns the number of seconds actually slept. You probably cannot mix "alarm" and "sleep" calls, because "sleep" is often implemented using "alarm".

On some older systems, it may sleep up to a full second less than what you requested, depending on how it counts seconds. Most modern systems always sleep the full amount. They may appear to sleep longer than that, however, because your process might not be scheduled right away in a busy multitasking system.
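The return value described above can be checked with a short sketch like the one below. The helper name timed_sleep is made up for this illustration; it compares what sleep reports with how far the wall clock advanced.

```perl
use strict;
use warnings;

# Hypothetical helper: sleep for a whole number of seconds and return
# both the value sleep() reported and the wall-clock time that passed.
sub timed_sleep {
    my ($seconds) = @_;
    my $start   = time;
    my $slept   = sleep($seconds);   # number of seconds actually slept
    my $elapsed = time - $start;     # whole seconds by the system clock
    return ($slept, $elapsed);
}

my ($slept, $elapsed) = timed_sleep(1);
print "sleep returned $slept; the clock advanced by $elapsed second(s)\n";
```

On a busy system the two numbers may be larger than the value requested, since the process might not be scheduled again right away.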

For delays of finer granularity than one second, you may use Perl's "syscall" interface to access setitimer(2) if your system supports it, or else use the four-argument form of "select" with a timeout. The Time::HiRes module (from CPAN, and starting from Perl 5.8 part of the standard distribution) may also help.
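As a rough sketch of two of the sub-second options mentioned above, the script below pauses for a quarter of a second, first with Time::HiRes::sleep and then with the four-argument select. The helper names pause_hires and pause_select are invented for this example, and it assumes Time::HiRes is installed, which should be the case for Perl 5.8 and later.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);   # replaces time() with a fractional-second version

# Hypothetical helper: pause using Time::HiRes::sleep, which accepts
# fractional seconds.
sub pause_hires {
    my ($seconds) = @_;
    Time::HiRes::sleep($seconds);
    return;
}

# Hypothetical helper: pause using select with no filehandles and a
# timeout, which also accepts fractional seconds.
sub pause_select {
    my ($seconds) = @_;
    select(undef, undef, undef, $seconds);
    return;
}

for my $pause (\&pause_hires, \&pause_select) {
    my $start = time;
    $pause->(0.25);
    printf "paused for about %.3f seconds\n", time - $start;
}
```

Either approach may pause slightly longer than requested, for the same scheduling reasons that apply to the built-in sleep.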

Example

my $sleeptime = 3; # Number of seconds to "sleep"
sleep($sleeptime);

Note

If you use sleep alone without any value given to it for the sleep period, then the script will sleep indefinitely.


Mon, May 28, 2012 3:16 pm

Wide Character in Print Warning from Perl Script

When I ran a Perl script I wrote to download webpages from a website, I kept getting two warning messages printed for the same line whenever I ran the script.

Wide character in print at ./get_webpage.pl line 39.
Wide character in print at ./get_webpage.pl line 39.

According to "Unicode-processing issues in Perl and how to cope with it", written by Ivan Kurmanov, the warning can occur when the text being printed contains Unicode characters and the output filehandle has not been told which encoding to use. Unicode is a "computing industry standard for the consistent encoding, representation and handling of text expressed in most of the world's writing systems."

Unicode can be implemented by different character encodings. The most commonly used encodings are UTF-8, UTF-16 and the now-obsolete UCS-2. UTF-8 uses one byte for each ASCII character, which has the same code value in both UTF-8 and ASCII encoding, and up to four bytes for other characters. UCS-2 uses a 16-bit code unit (two 8-bit bytes) for each character but cannot encode every character in the current Unicode standard. UTF-16 extends UCS-2, using two 16-bit units (four 8-bit bytes) to handle each of the additional characters.
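The byte counts described above can be checked with the core Encode module. The following is only an illustrative sketch: the helper name encoded_length and the four sample characters are my own choices, and UTF-16BE is used so that no byte-order mark is added to the count.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: number of bytes needed to store one character
# in the named encoding.
sub encoded_length {
    my ($encoding, $char) = @_;
    return length(encode($encoding, $char));
}

# Sample characters: an ASCII letter, a Latin-1 letter, the euro sign,
# and a character outside the range that UCS-2 can represent.
my @samples = ("A", "\x{E9}", "\x{20AC}", "\x{1F600}");

for my $char (@samples) {
    printf "U+%04X: %d byte(s) in UTF-8, %d byte(s) in UTF-16\n",
        ord($char),
        encoded_length('UTF-8',    $char),
        encoded_length('UTF-16BE', $char);
}
```

Only the last sample needs two 16-bit units in UTF-16; the others fit in one.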

The webpages I was downloading were encoded using UTF-8, which I confirmed by viewing the source code for one of the webpages I was downloading. In the "head" section of the webpage, I saw the following meta tag.

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

Line 39 in my Perl script, which printed the webpage, was as follows:

print WEBPAGE $page;

Prior to that line I had the following line:

open(WEBPAGE,">$first_fname") || die "$page$n could not be opened";

The variable $first_fname holds the name of the file in which the first page is saved to disk.

To resolve the problem, I followed the suggestion offered by Ivan to open the output file with the :utf8 layer, as in open FILE, ">:utf8", $filename; and therefore changed the line in my script to the following:

open(WEBPAGE,">:utf8",$first_fname) || die "$page$n could not be opened";

For subsequent pages, I used an id number followed by ".html" as the filename, with the id number changing based on the contents of the webpage downloaded. I enclosed ">:utf8" and "$id.html" in double quotes, and the script then ran without producing either the "wide character in print" warning or any error messages.


     if ($n == 1) {
       # First page will be named differently, e.g., index.html
       open(WEBPAGE,">:utf8",$first_fname) || die "$page$n could not be opened";
     }
     else {
       open(WEBPAGE,">:utf8", "$id.html") || die "$page$n could not be opened";
     }
     print WEBPAGE $page;
     close(WEBPAGE);
     $n++;
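To see the effect of the :utf8 layer in isolation, here is a small self-contained sketch that is not part of my get_webpage.pl script. It writes the same string to a temporary file twice, once with a plain ">" mode and once with ">:utf8", and counts the warnings Perl raises each time. The helper name write_and_count_warnings and the sample string are made up for this illustration, and it uses a lexical filehandle rather than the bareword WEBPAGE handle shown above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write $text to a temporary file using the given
# open mode; return the number of warnings raised and the file size.
sub write_and_count_warnings {
    my ($mode, $text) = @_;
    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };

    # Create a temporary file that is removed when the program exits.
    my ($tmp_fh, $path) = tempfile(UNLINK => 1);
    close($tmp_fh);

    open(my $out, $mode, $path) or die "Cannot open $path: $!";
    print {$out} $text;
    close($out) or die "Cannot close $path: $!";

    return (scalar(@warnings), -s $path);
}

# Stand-in for downloaded page content; U+263A is a "wide" character.
my $page = "smiley: \x{263A}\n";

for my $mode ('>', '>:utf8') {
    my ($count, $bytes) = write_and_count_warnings($mode, $page);
    print "mode $mode: $count warning(s), $bytes byte(s) written\n";
}
```

With the plain mode Perl should report the wide-character warning for the print, while the ">:utf8" mode should write the same bytes without it.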


Mon, May 28, 2012 2:52 pm

Interactive Spelling Checker aspell

For interactive spellchecking on Linux systems, you can use the aspell command at a shell prompt. Type the command followed by the "list" option, hit Enter, and then type one or more words for which you wish to check the spelling. Hit Ctrl-D when you've entered all of the words. You will then see the misspelled words displayed, or at least the ones whose spelling aspell doesn't recognize.

You can put the words one per line, if you wish, as in the example below. In that example, "cat" was the first word I entered and "apropos" was the last word I entered before hitting Ctrl-D. The aspell spellchecker showed "cateb" and "flummoxe" as the misspelled words immediately after the last word I entered.

$ aspell list
cat
cart
cateb
flummox
flummoxe
apropos
cateb
flummoxe

You can also enter all of the words you want checked on one line, if you prefer, as in the example below, where "cat horse dog punto giraffe" were typed. When I hit Enter and Ctrl-D to terminate word entry, aspell showed "punto" as the misspelled word. Though it is Italian for the English word "point", it is not a valid word in English.

$ aspell list
cat horse dog punto giraffe
punto


Mon, May 28, 2012 9:02 pm
