
4. Maintaining a WWW site or some Web Pages

If you maintain a web site, or even just a single web page, you have to think about what you offer to the network and about how you approach the readers and users of your pages.

4.1 The mainstream: HTML technical

Well, I'm not gonna tell you how HTML is encoded and how you have to design your pages. I'll just give you some pointers to where you can find more advanced information.

You should take a look at http://www.w3.org/ for the latest HTML language specification.

Take a look at the list at the end of this article; there you'll find more hints on where to read on.

4.2 Some thoughts about bandwidth

Many users connect to the Internet via slow modem lines. Speeds from 14,400 bps to 28,800 bps are state of the art for "private sites". In Europe, ISDN is spreading, but 64,000 bps isn't much faster when compared to, let's keep it simple, 10,000,000 bps Ethernet. And 10 Mbps Ethernet isn't really a high-speed LAN connection nowadays.

Since many users don't have fast access to the net, you should keep an eye on the ratio between information and bytes. Optimize it towards 1:1 if you can. You may use graphics in your web pages, following the multimedia trend, but always remember the goals of your page and of the graphic you're going to put in. If most of your users are connected via a slow modem line and the graphic serves only aesthetic purposes or some eye-catching effect, you'd better ban it from your pages or, at least, re-render it to the smallest possible file size and use the best compression. Your users will like it.

Always remember, nobody really likes an eye-catcher that shows up 3-5 minutes after the text.
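To get a feeling for these numbers, here is a rough back-of-the-envelope sketch in Python. It assumes the best case (8 bits per byte, no protocol or modem overhead), and the 360,000-byte image is just an example value I picked, not a measurement.

```python
# Best-case transfer times for one file over the line speeds named above.
# Assumption: 8 bits per byte and no protocol overhead, so real transfers
# will be somewhat slower than these figures.

LINE_SPEEDS_BPS = {
    "14,400 bps modem": 14_400,
    "28,800 bps modem": 28_800,
    "64,000 bps ISDN": 64_000,
    "10 Mbps Ethernet": 10_000_000,
}


def transfer_seconds(size_bytes: int, line_bps: int) -> float:
    """Return the best-case time in seconds to move size_bytes over the line."""
    return size_bytes * 8 / line_bps


EXAMPLE_IMAGE_BYTES = 360_000  # hypothetical eye-catcher graphic

for name, bps in LINE_SPEEDS_BPS.items():
    seconds = transfer_seconds(EXAMPLE_IMAGE_BYTES, bps)
    print(f"{name:>18}: {seconds:8.1f} s")
```

On a 14,400 bps line such a graphic needs about 200 seconds, which is more than three minutes, while on 10 Mbps Ethernet it takes well under a second.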

4.3 Some thoughts about server load

On a web server, there is normally at least one server task running. When this task reads a request from an HTTP client, it duplicates itself (on Linux this is called forking) and the new copy serves the request, while the original keeps listening for new requests. After finishing the request, the copy terminates. (In fact, some servers, like Apache, always keep a default of five server copies ready and waiting for requests alongside the master process, for speed reasons.)

Some web browsers, like the Netscape Navigator series, issue many requests in parallel to the same server, which increases the server load spent on a single user. These browsers retrieve the HTML page, parse it while it is still arriving, and issue new requests for other data such as embedded graphics, applet files, sound files or any other MIME-typed data. In contrast, 'simple' browsers request and retrieve one file after another, which keeps the server load per user as low as possible.

Many users prefer browsers that use the multi-request technique, like Netscape Navigator, because they present a more complete view of the requested page sooner than a single-request browser does.

In my opinion, this is because many page designers insist on embedding the information in graphics, shutting out text-only browsers.
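The fork-per-request scheme described above can be sketched in a few lines of Python. This is only an illustration for Unix-like systems, not how Apache or any other real server is written; the names handle, serve_one, reap_finished and run are made up for this example, and the reply is a fixed text instead of a real file.

```python
import os
import socket


def handle(conn: socket.socket) -> None:
    """Hypothetical request handler: read the request, send a fixed reply."""
    conn.recv(4096)
    conn.sendall(b"HTTP/1.0 200 OK\r\nContent-Type: text/plain\r\n\r\nhello\n")


def serve_one(listener: socket.socket) -> int:
    """Accept one connection, fork a copy to serve it, return the copy's pid."""
    conn, _addr = listener.accept()
    pid = os.fork()
    if pid == 0:
        # Child: serve this one request, then terminate.
        try:
            listener.close()
            handle(conn)
            conn.close()
        finally:
            os._exit(0)
    # Parent: drop its copy of the connection and return to listening.
    conn.close()
    return pid


def reap_finished() -> None:
    """Collect finished children so that no zombie processes pile up."""
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            return  # no children left at all
        if pid == 0:
            return  # children are still busy, nothing to collect yet


def run(port: int = 8080) -> None:
    """Master loop: keep listening, fork one copy per incoming request."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        listener.bind(("127.0.0.1", port))
        listener.listen(5)
        while True:
            serve_one(listener)
            reap_finished()
```

Calling run() starts the loop on port 8080 (an arbitrary choice). A server that keeps spare copies waiting avoids the cost of the fork at request time, which this sketch does not attempt.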

So we, as server maintainers, have the problem that most users fire multiple requests at our server within the same page retrieval. We can limit this by configuring the server software not to serve more than "x" requests from the same requesting system at the same time. But how do we get this "x"? It's not easy to calculate, and a lot of personal experience with your site is necessary to determine it. But I'll give you some hints.

We have to take into account our connection bandwidth, our server's memory size, some feeling for our server's CPU and disk performance and ... well, that's enough for a first glimpse. Take a look at the memory usage of a single server task. Then think about how many of them can be kept in memory at all. Think about what percentage of your web pages can remain in your server's disk cache. Balance the number of web server tasks against the disk cache size and you're really close to your personal "x".

Furthermore, you can factor in the other jobs the server has. E.g. if your system also serves FTP, you might limit the maximum possible connections to keep some minimum room for the FTP server task. If your web server also does some database services, you'd better keep some CPU cycles free and shrink your "x" as well. Play around with these values and test them. And (!) read the following chapter about CGI scripting, which also costs server performance and, depending on the CGI jobs, memory.
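As a starting point for that playing around, the hints above can be turned into a small calculation. Take this as a rough sketch only: the two rules of thumb (memory left over after the disk cache, and a minimum useful bandwidth per request) are my simplification, and every number in the example is an assumed value you have to replace with measurements from your own system.

```python
def tasks_by_memory(ram_mb: float, reserved_mb: float,
                    cache_mb: float, task_mb: float) -> int:
    """How many server tasks fit into RAM next to the disk cache.

    reserved_mb is what the system and other services (ftp, database, ...)
    need; cache_mb is the disk cache you want to keep for your pages.
    """
    if task_mb <= 0:
        raise ValueError("task_mb must be positive")
    free_mb = ram_mb - reserved_mb - cache_mb
    return max(0, int(free_mb // task_mb))


def tasks_by_bandwidth(link_bps: int, per_request_bps: int) -> int:
    """How many requests the line can feed at a useful speed at once."""
    if per_request_bps <= 0:
        raise ValueError("per_request_bps must be positive")
    return link_bps // per_request_bps


def estimate_task_limit(ram_mb: float, reserved_mb: float, cache_mb: float,
                        task_mb: float, link_bps: int,
                        per_request_bps: int) -> int:
    """Upper bound on simultaneous server tasks: the tighter of both limits."""
    return min(tasks_by_memory(ram_mb, reserved_mb, cache_mb, task_mb),
               tasks_by_bandwidth(link_bps, per_request_bps))


# Assumed example: 64 MB RAM, 16 MB for system and other services,
# 24 MB disk cache, 2 MB per server task, a 128,000 bps line, and at
# least 14,400 bps per request so modem users are not slowed down further.
limit = estimate_task_limit(64, 16, 24, 2, 128_000, 14_400)
print("simultaneous server tasks:", limit)
```

With these example values memory would allow 12 tasks but the line only 8, so 8 is the ceiling for all clients together; your per-client "x" has to stay below that.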

4.4 CGI vs. Applet / Client side script

- to be written - sorry - overview of advantages/disadvantages and hints on when to use which.

4.5 Style ideas

Uh, a really difficult topic to cover in a few sentences. I won't try to meddle with your ingenious design ideas, nor am I gonna push my personal design strategies on you. I'd just like to add one or two statements to the above ideas on server load and bandwidth.

Numerous studies of human behavior with user interfaces and on-screen presentation have produced interesting results. There are some simple facts one should keep in mind when designing WWW pages.

Did you know this? If you'd like to get more information on that, search for GUI style guides and ergonomics research results published by many universities and software companies (including MS).

4.6 HTML editors under Linux

Hm, there are some. In fact, there are reported to be many. But as I have already shot my bolt, I didn't test them all. I am really curious, though, and looking forward to reading the reports you're gonna mail.

vi, vim

vi and vim are perfectly usable for writing HTML code... (don't flame me on that) because HTML code only uses ASCII text characters. I don't want to fuel another editor war. Those who know vi/vim and use it daily can use it for HTML code as well. You can make vi/vim help you with developing HTML code by defining some macros. But as this is no VI-HOWTO, I'll leave it at that. Just take it that it is possible to use vi/vim for HTML editing (at least for some quick changes). If you already know how to program vi/vim, you'll certainly know how to adapt that to HTML as well. If you don't, well, never mind.

emacs & XEmacs

- to be written - sorry -

asWedit

- to be written - sorry -

other pointers

Ah, there was some reference to a package named phoenix, based on tkWWW, but I was not able to get them running on my system. I think it was a problem with my tcl/tk versions, but you never know. I didn't spend much time on them, so maybe both will run on your system. Just go and ask archie. Maybe you can drop me a mail if you are successful.

If you miss your favorite HTML editor here, just write me a mail. Maybe I'll add some pointers to web pages about HTML editors for Linux, too. Just send me some nice URLs.

4.7 Graphics

Thoughts, ideas, hints? Well, you may read the comp.graphics newsgroup. And you can visit http://www.w3.org/pub/WWW/Graphics/.

Format gif

GIF (Graphics Interchange Format) was introduced in 1987 by CompuServe, Inc. and revised in 1989. It uses an LZ-family algorithm (LZW), which is covered by a U.S. patent. So there might be legal problems with using this graphics format on the Internet, despite the fact that nearly everybody does.

GIF is a good format for small pictures with simply structured content, like computer graphics or banners.

GIF has some advantages, as it is one of the most widespread (if not the most widespread) graphics formats in online systems:

The disadvantages are:

Format jpeg

The Joint Photographic Experts Group (JPEG) designed the jpeg/jpg/jfif graphics format. This format is based on a discrete cosine transform (DCT) and Huffman coding. JPEG works with a significant loss of information, which can make your pictures somewhat less colorful or less sharp. Typical compression factors range from 1:5 to 1:50. (Above 1:10, anybody can see the artifacts introduced by the compression/decompression cycle.)

JPEG is a good format for photographs, large graphics and really complex pictures.
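To see what these factors mean in bytes, here is a small worked example in Python. It is only nominal arithmetic: the 640x480 true-color picture is an assumed example, and a real JPEG encoder will give sizes that depend on the picture content and the chosen quality setting.

```python
def raw_size_bytes(width: int, height: int, bytes_per_pixel: int = 3) -> int:
    """Size of the uncompressed picture (3 bytes per pixel = 24-bit color)."""
    return width * height * bytes_per_pixel


def compressed_size_bytes(raw_bytes: int, factor: int) -> int:
    """Nominal size after compressing with a factor of 1:factor."""
    return round(raw_bytes / factor)


raw = raw_size_bytes(640, 480)  # assumed example picture
print(f"uncompressed: {raw:,} bytes")
for factor in (5, 10, 50):
    size = compressed_size_bytes(raw, factor)
    print(f"1:{factor:<3} -> {size:,} bytes")
```

The uncompressed picture has 921,600 bytes; at 1:10, where artifacts start to become visible, it shrinks to 92,160 bytes, and at 1:50 to 18,432 bytes.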

The advantages are:

The disadvantages are:

Format png

Portable Network Graphics (PNG): the new format on the net. PNG is favored by the W3 Consortium. For more detailed information visit http://www.w3.org/pub/WWW/TR/WD-png.html and http://www.w3.org/pub/WWW/Graphics/PNG/Overview.html. There you'll find a technical specification, some programmers' information etc. PNG is an ideal replacement for GIF. The PNG homepage is at http://quest.jpl.nasa.gov/PNG/. For users, PNG will have some advantages and some disadvantages. Here they are:

For the advantages:

For the disadvantages:

PNG is currently supported on Linux by the following programs: ImageMagick (version >= 3.7), GhostScript 4.0, Gimp, PovRay 3.0, and the netpbm package. For xv 3.10a there is an unofficial patch.

Converters

- to be written - sorry - netpbm, xv, ghostscript, gimp, ImageMagick, CorelDraw on Wine :-)))

4.8 Specials

There are now many specials beyond the HTML'n'Image range: applets written in Java, JavaScript pages and many things beyond.

Java

There is nothing to add about Java in general; just read the Java section in the Netscape Navigator chapter of this HOWTO and the overview of Java applets vs. CGI scripts in this HOWTO. You can also read the really good and compact Linux JAVA HOWTO. For programming Java, please refer to one of the really good books on the subject.

ActiveX

At the time of writing, ActiveX is still Microsoft's child. Microsoft has claimed that they would release it to the public domain, or at least hand it over to an ActiveX consortium.

ActiveX has nothing to do with either the X Window System or XFree.

It is derived from the Microsoft and IBM OLE system. Once the specs are released, there should be a Unix port. But we have to wait until then. Nothing for Linux yet.

