This article provides an overview of character sets. The earliest character set used on web pages and in electronic documents was ASCII, the American Standard Code for Information Interchange. ASCII was developed in the early 1960s. It assigns machine-readable codes to 128 characters: the upper- and lower-case Roman alphabet, digits, punctuation, and control characters such as line feeds and tabs. ASCII is now also known as US-ASCII.
ECMA, the European Computer Manufacturers Association, extended ASCII in the 1980s into a group of 8-bit character sets with 256 code positions each. These extended sets cover European alphabets and languages as well as those of the Middle East as far as Yemen. ISO approved the new character sets, which are collectively referred to as the ISO 8859 family. Each of them retains the first 128 characters of ASCII and adds special characters unique to the languages it covers.
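The relationship between ASCII and the ISO 8859 family can be checked with a few lines of Python. This is only a small sketch: the sample text and the choice of parts (8859-1, 8859-5, 8859-7) are picked for illustration and do not come from the standards themselves.

```python
import unicodedata

# Plain ASCII text encodes to identical bytes in ASCII and in each ISO 8859 part.
text = "Hello, world!"
encodings = ["ascii", "iso-8859-1", "iso-8859-5", "iso-8859-7"]
encoded = {name: text.encode(name) for name in encodings}
same_bytes = len(set(encoded.values())) == 1
print("ASCII range identical:", same_bytes)

# A byte above 127 stands for a different character in each part.
high_byte = bytes([0xE9])
decoded = {name: high_byte.decode(name) for name in encodings[1:]}
for name, char in decoded.items():
    print(name, f"U+{ord(char):04X}", unicodedata.name(char))
```

The lower half (0 to 127) is shared, which is why English text survives a wrong ISO 8859 label, while accented or non-Latin letters do not.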
English-language websites typically use the ISO 8859-1 character set, also called Latin-1.
The development of Unicode was a major step forward: as of version 4 it defines approximately 100,000 characters, including all the characters of the ISO 8859 sets.
All these character sets are available to web designers and are compatible with web browsers and servers. A web browser falls back on its default character set unless it is told otherwise, for example by the Content-Type HTTP header, which is set in the web server's configuration files.
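To make the role of the Content-Type header concrete, the sketch below pulls the charset parameter out of a header value using Python's standard email package. The function name charset_from_content_type and the fallback parameter are hypothetical, chosen for this illustration; what a real browser does when no charset is present depends on the browser and is not modeled here.

```python
from email.message import Message


def charset_from_content_type(header_value, default=None):
    """Return the charset named in a Content-Type header value.

    `default` is a caller-chosen fallback (an assumption of this sketch).
    """
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() lower-cases the name and returns None if absent.
    return msg.get_content_charset() or default


print(charset_from_content_type("text/html; charset=ISO-8859-1"))
print(charset_from_content_type("text/html"))
```

The first call reports the charset from the header; the second shows the case where the server sends no charset and the browser is left to its default.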
However, this may not work well in shared hosting environments, where websites in many different languages are hosted on the same server. It is therefore safer to specify the character set with a <meta> tag when designing the page.
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
For pages that mix several languages, it is best to use Unicode, encoded as UTF-8. Unicode overlaps Latin-1, so English characters remain compatible.
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
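The following Python sketch shows why the declared charset has to match the bytes actually served. The sample word is an arbitrary choice for illustration.

```python
word = "caf\u00e9"  # "cafe" with an acute accent on the final e

latin1_bytes = word.encode("iso-8859-1")  # one byte per character
utf8_bytes = word.encode("utf-8")         # the accented letter takes two bytes

print(latin1_bytes)  # b'caf\xe9'
print(utf8_bytes)    # b'caf\xc3\xa9'

# Unicode code points 0-255 coincide with the Latin-1 byte values.
overlap = all(ord(bytes([b]).decode("iso-8859-1")) == b for b in range(256))
print("Latin-1 overlaps Unicode:", overlap)

# Decoding UTF-8 bytes under the wrong declared charset garbles the accent.
garbled = utf8_bytes.decode("iso-8859-1")
print(garbled == word)  # False
```

The ASCII letters come through unchanged in both encodings; only the accented character differs, which is the typical symptom of a page whose declaration and real encoding disagree.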
Sometimes it is necessary to override the server settings in .htaccess files, because the Content-Type header sent by the server can prevent the declaration in the page from taking effect. Character sets can also be set for pages in different languages on the same website through directives in the .htaccess files.
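As a rough illustration for an Apache server, an .htaccess file along the following lines could set the charset. AddDefaultCharset and AddCharset are standard Apache directives, but the .ru extension and the choice of ISO-8859-5 are hypothetical, and the host must allow such overrides (AllowOverride FileInfo) for the file to have any effect.

```apache
# Add "charset=utf-8" to text/html and text/plain responses from this directory.
AddDefaultCharset utf-8

# Hypothetical per-language override: label files with a .ru extension as ISO-8859-5.
AddCharset ISO-8859-5 .ru
```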
The <meta> declaration should always be on the first line after the <head> tag, so that the browser reads it before any other content in the page.
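A page can be checked for its declaration with a short script. The sketch below uses Python's built-in HTMLParser to find the first <meta http-equiv="Content-Type"> tag and report its charset. The names MetaCharsetFinder and declared_charset are hypothetical, and the sketch does not reproduce a browser's full encoding detection.

```python
from email.message import Message
from html.parser import HTMLParser


class MetaCharsetFinder(HTMLParser):
    """Record the charset from the first Content-Type <meta> tag seen."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attributes = dict(attrs)
        if (attributes.get("http-equiv") or "").lower() != "content-type":
            return
        content = attributes.get("content")
        if not content:
            return
        # Reuse the header parser to read the charset parameter.
        msg = Message()
        msg["Content-Type"] = content
        self.charset = msg.get_content_charset()


def declared_charset(html_text):
    """Return the charset declared in the page, or None if there is none."""
    finder = MetaCharsetFinder()
    finder.feed(html_text)
    return finder.charset


page = (
    "<html><head>"
    '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
    "<title>Example</title></head><body></body></html>"
)
print(declared_charset(page))
```

Running it over a site's pages is a quick way to spot files that lack a declaration or declare a charset different from the one intended.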