Archive for June, 2012

Smashing Daily #18: Hiring, Detecting, Singing


  

Here’s episode #18 of The Smashing Daily, with the question of whether designers must know how to code, an in-depth look at type classification, lots of great stuff about Responsive Design, things about content and much more. Enjoy!

“We don’t hire designers who can’t code”
Many developers think that designers should be able to code, but most designers think that’s not necessary. Here’s a very interesting article by Roger Davis about this subject. Interesting, because Roger has been working in an environment that forced him to code a long time ago, so he knows what it means to both design and code. A great read for designers and developers.

Should designers code?

“Adactio: Journal—Responsive questions”
A while ago Jeremy Keith was interviewed about Responsive Design. It turned out to be a very interesting read, with good questions and good answers, too. I liked Jeremy’s answer about the fact that it took so long to finally start using fluid layouts: “While media queries are a relatively recent innovation, we’ve always had the ability to create fluid layouts. And yet Web designers and developers have willfully ignored that fact, choosing instead to create un-webby fixed-width layouts.” Exactly my thoughts. You should read this interview, it’s really good!

“Type classifications are useful, but the common ones are not”
For anyone who has ever tried to manage a large type library: how do you do that? How do you classify all those different types? In this article Indra Kupferschmid looks at the history of type classification and at current models, proposing some new ways to approach them. A long, very well researched article, and a great read!

“A New Take on Responsive Tables”
Tables are extremely hard to get right on small screens. Several possible solutions have been created, and here’s a new one which uses a fixed left header and a scrollable table body. The only issue with this solution is that it doesn’t work on a large portion of the current phones out there.

A responsive tables solution

“Taking “Content First” Very Seriously”
When Cloudfour redesigned their website, they decided to take a “Content First” approach. Lyza Gardner wrote this post about the process and about the things they found out while building the website.

“What’s the greater fear for publishers: Amazon or piracy?”
Book publishers have been looking at the wrong enemy for too long: they always thought piracy would be a problem for eBooks, so they embraced eBook formats that support DRM. It turns out that the biggest bookseller (which also has its own eBook format) has turned into the biggest enemy.

“Using Modernizr to detect HTML5 features and provide fallbacks”
One of the most interesting things about Web development (but also one of the more complex things) is the fact that you can create fallbacks for older browsers. One great tool that can help you out in making these fallbacks is Modernizr. Tom Leadbetter wrote this in-depth article about how it works and about what you can do with it. I highly recommend reading it: it’s a great tool and a great article.

Modernizr

Last Click

“Like A Rounded Corner (Bruce and The Standardettes)”
For all you Web standards nerds and CSS3 designers out there, here’s the brilliant song Like A Rounded Corner by Bruce Lawson and The Standardettes. I laughed out loud at “stylesheet masturbation has replaced my imagination” and at “HTML5, so it is iOS ready for you”. Hilarious.

Hahaha

Previous Issues

For previous Smashing Daily issues, check out the Smashing Daily Archive.


© Vasilis van Gemert for Smashing Magazine, 2012.


Showcase of Awesome e-Commerce Platforms for Your Website


  

With the advent of online shopping and marketing, e-commerce websites have become the norm for businesses, both big and small. The number of online shopping stores as well as vendors and firms selling their products or services on the internet is increasing every day, and this has added to the importance of capable e-commerce platforms.

And when it comes to e-commerce platforms, the options are numerous. In this article, we take a look at some of the best known e-commerce platforms to help you create your own online store. Most such platforms offer a similar set of features – ranging from awesome integration with payment gateways to secure shopping carts. Some of these platforms also come with hosting plans of their own, as we shall soon see. So without further ado, let’s take a look at the major e-commerce platforms.

On With the Show-case

Magento

Magento is one of the most popular e-commerce CMSs, with over 100,000 users. Magento offers separate solutions based on your business needs, including Enterprise and Small Business versions. The company also conducts Developer Certification programs.

Magento

Highwire

Highwire is a unique solution in the sense that apart from setting up your own store, you can also use it to sell on eBay or Facebook. Plus, you can sync your inventory across multiple channels of e-commerce – thus, no matter where you decide to sell, be it your own site or eBay, you can always keep track of all orders and payments from one centralized system. It comes loaded with excellent mobile-friendly features (including iOS and Android apps). While most features are Premium, there is also a Free plan.

Highwire

Shopify

Shopify is an easy to use hosted e-commerce solution that lets you create your online store within minutes. It comes loaded with secure shopping carts, beautiful e-commerce templates as well as web hosting. Before you purchase Shopify, you can also opt for a risk-free 30-day trial.

Shopify

CubeCart

CubeCart is a flexible e-commerce platform that offers a hassle-free way to set up your own e-commerce website. CubeCart keeps it really simple when it comes to pricing – you can either go for the free Lite version which limits you to 100 customers and 250 products (no technical support, by the way), or you can pay a one-time fee of $180 and get the Pro version with tech support and unlimited features.

CubeCart

Zen Cart

Before anything else, Zen Cart has an awesome tagline – “the art of e-commerce”! Apart from that, Zen Cart is free and open source software that is pretty simple to use and set up.

Zen Cart

osCommerce

osCommerce is another open source e-commerce solution. It powers over 12,000 websites and has an active community of over 260,000 members. osCommerce is licensed under the GPL and is well supplemented by add-ons.

osCommerce

Volusion

Volusion is an all-in-one hosted e-commerce solution. So unlike Zen Cart, you won’t really be able to download-and-install Volusion for free, but their pricing is competitive, and Volusion also comes with a 14-day free trial. The product is backed by 24×7 support as well.

Volusion

Ubercart

Ubercart isn’t really a full-fledged CMS in its own right. Instead, it is a shopping cart solution that you can integrate within your Drupal-powered website. So Ubercart will probably not be the most powerful bet for your e-commerce needs, but if you are running a Drupal-based website and wish to have features such as paid or Premium downloads, shopping cart, etc. you can consider using Ubercart. And Ubercart is free and open source.

Ubercart

BigCommerce

BigCommerce lets you sell your products on Facebook and eBay as well as your own store. If you wish to sell unlimited products on your website, you’ll have to opt for the Platinum Plan at $149.95 per month. There is also a free 15-day trial.

BigCommerce

FoxyCart

FoxyCart is a unique e-commerce product – unlike the rest, it does not claim to be the one-stop solution to all your e-commerce woes. It is not a CMS in its own right, and does not come with specialized tools for analytics or inventory. Instead, FoxyCart simply integrates itself seamlessly within your existing design and functionality.

FoxyCart

E-junkie

E-junkie provides shopping cart functionality to help you sell products on your own website or websites such as eBay and CraigsList. It is one of the cheapest e-commerce solutions out there – 500 MB of storage (maximum of 120 products) is available for a monthly fee of $27.

E-junkie

SolidShops

SolidShops is a flexible e-commerce solution that comes loaded with web hosting as well. It offers you features such as daily backups, custom tax settings, Facebook stores and stock tracking. Plus, SolidShops also has native support for SEO.

SolidShops

Cart66

Cart66 is a plugin that transforms your WordPress-powered website into an e-commerce store. You can track your inventory, charge tax on the basis of ZIP code, specify currency, and perform several other functions. Cart66 also offers a Lite version that is free to use, but has limited functionality.

Cart66

VirtueMart

Just as Cart66 helps change WP websites into e-commerce stores, VirtueMart performs the same function for Joomla! websites. However, unlike Cart66, VirtueMart is not just open source but also free to download and use. Apart from that, VirtueMart focuses more on shopping cart functionality rather than full-fledged e-commerce features.

Virtue Mart

1ShoppingCart

1ShoppingCart is actually spread across two separate products – you can either choose a simple shopping cart software, or opt for the entire e-commerce solution. 1ShoppingCart boasts of features such as real-time shipping rates, PayPal integration, etc.

1ShoppingCart

Big Cartel

Big Cartel is a simple shopping cart primarily meant for designers and artists. It comes with a Free version, in which you can sell a maximum of 5 products and cannot use your own domain. If you need additional functionality, you can opt for their paid plans.

Big Cartel

Loaded Commerce

Loaded Commerce is a platform that offers features such as shopping cart, inventory tracking, PayPal integration, and so on.

Loaded Commerce

Adobe Business Catalyst

Adobe Business Catalyst lets you create websites without needing server-side programming skills. In fact, BC is more of a hosted CMS for any kind of website, not just e-commerce. You can either purchase just BC, or have it as part of Adobe Creative Cloud along with other products.

Adobe Business Catalyst

Closing Time

With that, we come to the end of this round-up. Which e-commerce platform or tool do you use for your own online store, or have you used for clients? Do share your thoughts with us in the comments below!

(rb)


Design Patterns: When Breaking The Rules Is OK


  

We’d like to believe that we use established design patterns for common elements on the Web. We know what buttons should look like, how they should behave and how to design the Web forms that rely on those buttons.

And yet, broken forms, buttons that look nothing like buttons, confusing navigation elements and more are rampant on the Web. It’s a boulevard of broken patterns out there.

This got me thinking about the history and purpose of design patterns and when they should and should not be used. Most interestingly, I started wondering when breaking a pattern in favor of something different or better might actually be OK. We all recognize and are quick to call out when patterns are misused. But are there circumstances in which breaking the rules is OK? To answer this question properly, let’s go back to the beginning.

The History of Design Patterns

In 1977, the architect Christopher Alexander cowrote a book named A Pattern Language: Towns, Buildings, Construction, introducing the concept of pattern language as “a structured method of describing good design practices within a field of expertise.” The goal of the book was to give ordinary people (not just architects and governments) a blueprint for improving their own towns and communities. In Alexander’s own words:

At the core… is the idea that people should design for themselves their own houses, streets and communities. This idea… comes simply from the observation that most of the wonderful places of the world were not made by architects but by the people.

Street cafe in San Diego (Image credit: shanputnam)

A pattern — whether in architecture, Web design or another field — always has two components: first, it describes a common problem; secondly, it offers a standard solution to that problem. For example, pattern 88 in A Pattern Language deals with the problem of identity and how public places can be introduced to encourage mixing in public. One of the proposed solutions is street cafes:

The street cafe provides a unique setting, special to cities: a place where people can sit lazily, legitimately, be on view, and watch the world go by. Therefore: encourage local cafes to spring up in each neighborhood. Make them intimate places, with several rooms, open to a busy path, where people can sit with coffee or a drink and watch the world go by. Build the front of the cafe so that a set of tables stretch out of the cafe, right into the street. The most humane cities are always full of street cafes.

For those interested in going further down the pattern 88 rabbit hole, there is even a Flickr group dedicated to examples of this pattern.

The jump from architecture to the Web was quite natural because the situation is similar: we have many common interaction problems that deserve standard solutions. One such example is Yahoo’s “Navigation Tabs” pattern. The problem:

The user needs to navigate through a site to locate content and features and have clear indication of their current location in the site.

And the solution:

Presenting a persistent single-line row of tabs in a horizontal bar below the site branding and header is a way to provide a high level navigation for the website when the number of categories is not likely to change often. The element should span across the entire width of the page using limited as well as short and predictable titles with the current selected tab clearly highlighted to maintain the metaphor of file folders.

This is all very nice, but we need to dig deeper to understand the benefits of using such a pattern in digital product design.

The Benefits Of Design Patterns

Patterns are particularly useful in design for two main reasons:

  • Patterns save time because we don’t have to solve a problem that’s already been solved. If done right, we can apply the principles behind each pattern to solve other common design problems.
  • Patterns make the Web easier to use because, as adoption increases among designers, users get used to how things work, which in turn reduces their cognitive load when encountering common design elements. To put it in academic terms, when patterns reach high adoption rates, they become mental models — sets of beliefs in the user’s mind about how a system should work.

Perhaps the strongest case for using existing design patterns instead of making up new ones comes (once again) from architecture. In the article “The Value of Unoriginality,” Dmitri Fadeyev quotes Owen Jones, an architect and influential design theorist of the 19th century, from his book The Grammar of Ornament:

To attempt to build up theories of art, or to form a style, independently of the past, would be an act of supreme folly. It would be at once to reject the experiences and accumulated knowledge of thousands of years. On the contrary, we should regard as our inheritance all the successful labours of the past, not blindly following them, but employing them simply as guides to find the true path.

That last sentence is key. Patterns aren’t excuses to blindly copy what others have done, but they do provide blueprints for design that can be extremely useful to designers and users. And so we do need to stand on the shoulders of designers who have come before us — for the good of the Web and users’ sanity. Many have tried to document the most common Web design patterns, with varying levels of success. In addition to the Yahoo Design Pattern Library, there’s Peter Morville’s Design Patterns, Welie.com and, my personal favorite, UI-Patterns.com.

When Patterns Attack

Here’s the “but” to everything we’ve discussed up to now. There is a dark side to patterns that we don’t talk about enough. One doesn’t simply copy a pattern library from a bunch of random places, put it on an internal wiki and then wait for the magic to happen. Integrating and maintaining an internal design pattern library is hard work, and if we don’t take this work seriously, bad things will happen. Stephen Turbek sums up the main issues with pattern libraries in his article “Are Design Patterns an Anti-Pattern?”:

  • Design patterns are not effective training tools.
  • Design patterns don’t replace UX expertise.
  • Completeness and learn-ability are in conflict.
  • Design patterns take a lot of investment.
  • Design patterns should help non-UX people first.

This article isn’t meant to discuss these issues in detail, so I highly recommend reading Turbek’s post.

For the purpose of this article, let’s assume we’ve done everything right. We have a published and well-known pattern library that enjoys wide adoption within our organization. We treat the libraries as guidelines and blueprints, not laws to be followed without thinking about the problem at hand. The question I’m particularly interested in is, when is it OK to break a widely adopted design pattern and guide users to adopt a new way of solving a problem?

When We Attack Patterns

Despite all of their benefits, most of the Web seems to have little respect for patterns. The most glaring examples of broken design patterns are found in Web forms. Based on years of research, we know how to design usable forms. From Luke Wroblewski’s book Web Form Design to countless articles on things like multiple-column layouts and positioning of labels, we don’t have to guess any more. The patterns are there, and they’re well established. And yet, we see so many barely usable forms online.

As an example of a broken form pattern, look at the registration form for Expotel below:

Notice the small input fields; the left-aligned labels, with the miles of space between them and the input fields; the placement and design of the “Close” and “Register” buttons, which actually emphasize “Close” more. Oh, and what is a “Welcome Message”? Where will it be used? We can all agree that this is not good form design and is not a good way to break a pattern.

But passing judgment on a broken pattern is not always as easy as it is with the example above. Google’s recent decision to remove the “+” from the button to open a new tab in Chrome came under a bit of fire. It breaks a pattern that has been included in most browsers that have tab-based browsing as a feature, and yet Google claims that it did user research before making this change. Was this the right decision?

Google New Tab

And then there are UIs that we might not know what to make of. iOS apps such as Clear and Path introduce new interactions that we haven’t seen before — to much praise as well as negative feedback. A step forward in design or failed experiments?

As with most design decisions, the answers are rarely clear or black and white. A tension exists between patterns and new solutions that cannot be resolved with a formula. Users are familiar with the established way of doing things, yet a new solution to the problem might be better and even more natural and logical. So, when is changing something familiar to something different OK? There are two scenarios in which we should consider breaking a design pattern.

The New Way Empirically Improves Usability

One of the dangers of iterating on an existing design is what is known as the “local maximum.” As Joshua Porter explains:

The local maximum is a point at which you’ve hit the limit of the current design… It is as effective as it’s ever going to be in its current incarnation. Even if you make 100 tweaks, you can only get so much improvement; it is as effective as it’s ever going to be on its current structural foundation.

With patterns, it could happen that we continue to improve an existing solution even while a better one exists. This is one of the pitfalls of A/B testing: it does a great job of finding the local maximum, but not of finding those new and innovative solutions.

We gain much from incremental innovation, but sometimes a pattern is ripe for radical innovation. We need to go into every design problem with our eyes wide open, eager to find new solutions, and ready to test those solutions to make sure that we’re not following bad intuition. As Paul Scrivens points out in “Designing Ideas”:

You will never be first with a new idea. You will be first with a new way to present the idea or a new way to combine that idea with another. Ideas are nothing more than mashups of the past. Once you can embrace that, your imagination opens up a bit more and you start to look elsewhere for inspiration.

This is what the Chromium team claims to have done with the “+” button in Chrome. It believes it has found a better solution, and it’s tested it.

The Established Way Becomes Outdated

Think of the icon for the “save” action in most applications. When was the last time you saw a floppy drive? Exactly. Sometimes the world shifts beneath us, and we have to adjust. Failing to do so, we could get stuck in dangerous ruts, as Twyla Tharp attests (quoted by Yesenia Perez-Cruz):

More often than not, I’ve found a rut is the consequence of sticking to tried and tested methods that don’t take into account how you or the world has changed.

The publishing industry knows this better than most. Stewart Curry has this to say in “The Trope Must Die”:

Design patterns can be very useful, but when we’re making a big shift in media, they can sometimes hold progress back. If we look at the evolution of digital publications, it’s been a slow and steady movement from (in the most part) a printed page to reproducing that printed page on a digital device. It’s steady, linear, and not very imaginative, where “it worked in print, so it will work in digital” seems to be the mindset.

This is where the developers of apps such as Clear and Path are doing the bold, right thing. They realize that we’re at the beginning of a period of rapid innovation in gesture-based interfaces, and they want to be at the forefront of that. Some ideas will fail and some will succeed, but it’s important that our design patterns respond to the new touch-based world we’re a part of.

Clear Todo App

Our design patterns have to adjust not only to a shift in our interaction metaphors, but to a significant shift in technology usage in general. Tammy Erickson did some research on what she calls the “Re-Generation” (i.e. post-Generation Y) and discusses some of her findings in “How Mobile Technologies Are Shaping a New Generation”:

Connectivity is the basic assumption and natural fabric of everyday life for the Re-Generation. Technology connections are how people meet, express ideas, define identities, and understand each other. Older generations have, for the most part, used technology to improve productivity — to do things we’ve always done, faster, easier, more cheaply. For the Re-Generation, being wired is a way of life.

Expectations of apps and services change when everything is always on and accessible. We become less tolerant of slow transitions and flows that are perceived to be too complex. We are being forced to rethink sign-up forms and payment flows in an environment where time and attention have become scarcer than ever. We don’t have to reinvent the wheel, but we do need to find better ways to keep it rolling.

The Informed Decision Is The Right Decision

Design patterns bring many benefits, as well as some drawbacks to watch out for. But we’d be foolish to ignore these helpful guidelines. There is no formula for what we need to do; rather, we need to operate within certain boundaries to ensure we’re creating great design solutions without alienating users. Here is what we need to do:

  • Study design patterns that are relevant to the applications we are working on. We need to know them by heart — and know why they exist — so that we can use them as loose blueprints for our own work.
  • Approach each new project with a mind open enough to discover better ways to solve recurring problems.
  • Stay up to date on our industry (as well as adjacent ones) so that we recognize external changes that require us to rethink solutions that currently work quite well but might be outdated soon.

In short, we can neither follow nor ignore design patterns completely. Instead, we need a deep understanding of the rules of human-computer interaction, so that we know when breaking them is OK.

(al)


© Rian van der Merwe for Smashing Magazine, 2012.


All About Unicode, UTF8 & Character Sets


  

This is a story that dates back to the earliest days of computers. The story has a plot, well, sort of. It has competition and intrigue, as well as traversing oodles of countries and languages. There is conflict and resolution, and a happyish ending. But the main focus is the characters — 110,116 of them. By the end of the story, they will all find their own unique place in this world.

This story will follow a few of those characters more closely, as they journey from Web server to browser, and back again. Along the way, you’ll find out more about the history of characters, character sets, Unicode and UTF-8, and why question marks and odd accented characters sometimes show up in databases and text files.

Warning: This article contains lots of numbers, including a bit of binary — best approached after your morning cup of coffee.

ASCII

Computers only deal in numbers and not letters, so it’s important that all computers agree on which numbers represent which letters.

Let’s say my computer used the number 1 for A, 2 for B, 3 for C, etc., and yours used 0 for A, 1 for B, etc. If I sent you the message HELLO, then the numbers 8, 5, 12, 12, 15 would whiz across the wires. But for you, 8 means I, so you would receive and decode it as IFMMP. To communicate effectively, we would need to agree on a standard way of encoding the characters.
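The mix-up above is easy to reproduce. Here is a minimal sketch, assuming messages made only of the capital letters A to Z; the helper names `encode` and `decode` are made up for this illustration and are not part of any standard:

```javascript
// Hypothetical helpers for illustration only: each side picks its own
// number for the letter A and counts upwards from there.
const ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';

// Turn a message into numbers, given the number this computer uses for A.
function encode(message, numberForA) {
  return Array.from(message, (letter) => ALPHABET.indexOf(letter) + numberForA);
}

// Turn numbers back into letters, given the number this computer uses for A.
function decode(numbers, numberForA) {
  return numbers.map((n) => ALPHABET[n - numberForA]).join('');
}

const sent = encode('HELLO', 1);   // my computer: A = 1
const received = decode(sent, 0);  // your computer: A = 0

console.log(sent.join(', '));      // 8, 5, 12, 12, 15
console.log(received);             // IFMMP
```

Decoding with the same starting number that was used for encoding, `decode(sent, 1)`, gives back HELLO, which is all that "agreeing on a standard" means here.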

To this end, in the 1960s the American Standards Association created a 7-bit encoding called the American Standard Code for Information Interchange (ASCII). In this encoding, HELLO is 72, 69, 76, 76, 79 and would be transmitted digitally as 1001000 1000101 1001100 1001100 1001111. Using 7 bits gives 128 possible values from 0000000 to 1111111, so ASCII has enough room for all lower case and upper case Latin letters, along with each numerical digit, common punctuation marks, spaces, tabs and other control characters. In 1968, US President Lyndon Johnson made it official: all computers purchased by the US federal government had to support ASCII.
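If you would like to check those numbers, the following sketch converts a piece of text into its ASCII codes and 7-bit patterns. The function name `toSevenBitAscii` is an assumption made for this example; `charCodeAt`, `toString(2)` and `padStart` are standard JavaScript:

```javascript
// Hypothetical helper: returns one 7-digit binary string per character.
// It refuses anything outside the 128 ASCII values (0 to 127).
function toSevenBitAscii(text) {
  return Array.from(text, (character) => {
    const code = character.charCodeAt(0);
    if (code > 127) {
      throw new RangeError('Not an ASCII character: ' + character);
    }
    // toString(2) drops leading zeros, so pad back up to 7 bits.
    return code.toString(2).padStart(7, '0');
  });
}

const codes = Array.from('HELLO', (character) => character.charCodeAt(0));
const bits = toSevenBitAscii('HELLO');

console.log(codes.join(', ')); // 72, 69, 76, 76, 79
console.log(bits.join(' '));   // 1001000 1000101 1001100 1001100 1001111
```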

Trying It Yourself

There are plenty of ASCII tables available, displaying or describing the 128 characters. Or you can make one of your own with a little bit of CSS, HTML and JavaScript, most of which is to get it to display nicely:

<html>
<body>
<style type="text/css">p {float: left; padding: 0 15px; margin: 0; font-size: 80%;}</style>
<script type="text/javascript">
for (var i=0; i<128; i++) document.writeln ((i%32?'':'<p>') + i + ': ' + String.fromCharCode (i) + '<br>');
</script>
</body>
</html>

This will display a table like this:

Do-It-Yourself Javascript ASCII table viewed in Firefox.

The most important bit of this is the JavaScript String.fromCharCode function. It takes a number and turns it into a character. In fact, the following four lines of HTML and JavaScript all produce the same result. They all get the browser to display character numbers 72, 69, 76, 76 and 79:

HELLO
&#72;&#69;&#76;&#76;&#79;
<script>document.write ("HELLO");</script>
<script>document.write (String.fromCharCode (72,69,76,76,79));</script>

Also notice how Firefox displays the unprintable characters (like backspace and escape) in the first column. Some browsers show blanks or question marks. Firefox squeezes four hexadecimal digits into a small box.

The Eighth Bit

Teleprinters and stock tickers were quite happy sending 7 bits of information to each other. But the newfangled microprocessors of the 1970s preferred to work with powers of 2. They could process 8 bits at a time and so used 8 bits (aka a byte or octet) to store each character, giving 256 possible values.

An 8 bit character can store a number up to 255, but ASCII only assigns up to 127. The other values from 128 to 255 are spare. Initially, IBM PCs used the spare slots to represent accented letters, various symbols and shapes and a handful of Greek letters. For instance, number 200 was the lower left corner of a box: ╚, and 224 was the Greek letter alpha in lower case: α. This way of encoding the letters was later given the name code page 437.

However, unlike ASCII, characters 128-255 were never standardized, and various countries started using the spare slots for their own alphabets. Not everybody agreed that 224 should display α, not even the Greeks. This led to the creation of a handful of new code pages. For example, in Russian IBM computers using code page 855, 224 represents the Cyrillic letter Я. And in Greek code page 737, it is lower case omega: ω.

Even then, there was disagreement. From the 1980s Microsoft Windows introduced its own code pages. In the Cyrillic code page Windows-1251, 224 represents the Cyrillic letter а, and Я is at 223.

Starting in the late 1980s, an attempt at standardization was made. Fifteen different 8 bit character sets were created to cover many different alphabets such as Cyrillic, Arabic, Hebrew, Turkish, and Thai. They are called ISO-8859-1 up to ISO-8859-16 (number 12 was abandoned). In the Cyrillic ISO-8859-5, 224 represents the letter р, and Я is at 207.

So if a Russian friend sends you a document, you really need to know what code page it uses. The document by itself is just a sequence of numbers. Character 224 could be Я, а or р. Viewed using the wrong code page, it will look like a bunch of scrambled letters and symbols.
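You can watch one number change its meaning without setting up a server. This is a minimal sketch that uses the standard TextDecoder API, available in current browsers and in Node.js builds that ship with full ICU data; the helper name `decodeWith` is made up for this example. Two assumptions to be aware of: the old IBM code pages 437, 855 and 737 are not among the encodings TextDecoder offers, so only the Windows and ISO character sets are shown, and browsers treat the label ISO-8859-1 as Windows-1252, which also places à at 224.

```javascript
// A one-byte "document" containing only the number 224.
const byte224 = new Uint8Array([224]);

// Hypothetical helper: interpret that same byte using the given character set.
function decodeWith(characterSet) {
  return new TextDecoder(characterSet).decode(byte224);
}

const results = {
  'windows-1252': decodeWith('windows-1252'), // Latin: a with grave accent
  'windows-1251': decodeWith('windows-1251'), // Cyrillic lower case a
  'iso-8859-5': decodeWith('iso-8859-5'),     // Cyrillic lower case er
};

for (const [characterSet, character] of Object.entries(results)) {
  console.log(characterSet + ': ' + character);
}
```

The byte never changes; only the table used to look it up does, which is why a document is unreadable without knowing its character set.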

(The situation isn’t quite as bad when viewing Web pages — as Web browsers can usually detect a page’s character set based on frequency analysis and other such techniques. But this is a false sense of security — they can and do get it wrong.)

Trying It Yourself

Code pages are also known as character sets. You can explore these character sets yourself, but you have to use PHP or a similar server side language this time (roughly because the character needs to be in the page before it gets to the browser). Save these lines in a PHP file and upload it to your server:

<html>
<head>
<meta charset="ISO-8859-5">
</head>
<body>
<style type="text/css">p {float: left; padding: 0 15px; margin: 0; font-size: 80%;}</style>
<?php  for ($i=0; $i<256; $i++) echo ($i%32?'':'<p>') . $i . ': ' . chr ($i) . '<br>'; ?>
</body>
</html>

This will display a table like this:

Cyrillic character set ISO-8859-5 viewed in Firefox.

The PHP function chr does a similar thing to JavaScript’s String.fromCharCode. For example, chr(224) embeds the number 224 into the Web page before sending it to the browser. As we’ve seen above, 224 can mean many different things. So, the browser needs to know which character set to use to display the 224. That’s what the meta line near the top is for. It tells the browser to use the Cyrillic character set ISO-8859-5:

<meta charset="ISO-8859-5">

If you exclude the charset line, then it will display using the browser’s default. In countries with Latin-based alphabets (like the UK and US), this is probably ISO-8859-1, in which case 224 is an a with grave accent: à. Try changing this line to ISO-8859-7 or Windows-1251 and refresh the page. You can also override the character set in the browser. In Firefox, go to View > Character Encoding. Swap between a few to see what effect it has. If you try to display more than 256 characters, the sequence will repeat.

Summary Circa 1990

This is the situation in about 1990. Documents can be written, saved and exchanged in many languages, but you need to know which character set they use. There is also no easy way to use two or more non-English alphabets in the same document, and alphabets with more than 256 characters like Chinese and Japanese have to use entirely different systems.

Finally, the Internet is coming! Internationalization and globalization are about to make this a much bigger issue. A new standard is required.

Unicode To The Rescue

Starting in the late 1980s, a new standard was proposed – one that would assign a unique number (officially known as a code point) to every letter in every language, one that would have way more than 256 slots. It was called Unicode. It is now in version 6.1 and consists of over 110,000 code points. If you have a few hours to spare you can watch them all whiz past.

The first 128 Unicode code points are the same as ASCII. The range 128-255 contains currency symbols and other common signs and accented characters (aka characters with diacritical marks), and much of it is borrowed from ISO-8859-1. After 256 there are many more accented characters. After 880 it gets into Greek letters, then Cyrillic, Hebrew, Arabic, Indic scripts, and Thai. Chinese, Japanese and Korean start from 11904 with many others in between.

This is great – no more ambiguity – each letter is represented by its own unique number. Cyrillic Я is always 1071 and Greek α is always 945. 224 is always à, and H is still 72. Note that these Unicode code points are officially written in hexadecimal preceded by U+. So the Unicode code point H is usually written as U+0048 rather than 72 (to convert from hexadecimal to decimal: 4*16+8=72).
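
Switching between the decimal numbers used in this article and the official U+ notation takes only a couple of lines of Javascript. This is a minimal sketch; toUPlus and fromUPlus are hypothetical helper names, not built-in functions:

```javascript
// Format a decimal code point in U+ notation (at least 4 hexadecimal digits).
function toUPlus(codePoint) {
  var hex = codePoint.toString(16).toUpperCase();
  while (hex.length < 4) hex = '0' + hex;
  return 'U+' + hex;
}

// Convert U+ notation back to a decimal code point.
function fromUPlus(notation) {
  return parseInt(notation.replace(/^U\+/i, ''), 16);
}

console.log(toUPlus(72));          // U+0048, the letter H
console.log(toUPlus(1071));        // U+042F, Cyrillic Я
console.log(fromUPlus('U+03B1'));  // 945, Greek α
```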

The major problem is that there are more than 256 of them. The characters will no longer fit into 8 bits. However Unicode is not a character set or code page. So officially that is not the Unicode Consortium’s problem. They just came up with the idea and left someone else to sort out the implementation. That will be discussed in the next two sections.

Unicode Inside The Browser

Unicode does not fit into 8 bits, not even into 16. Although only 110,116 code points are in use, it has the capability to define up to 1,114,112 of them, which would require 21 bits.
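
The bit counts quoted above can be checked with a short loop. This is just an arithmetic sketch, and bitsNeeded is a hypothetical helper name:

```javascript
// Smallest number of bits that gives every one of `count` values its own number.
function bitsNeeded(count) {
  var bits = 0;
  var capacity = 1;            // always 2 to the power of `bits`
  while (capacity < count) {
    capacity *= 2;
    bits++;
  }
  return bits;
}

console.log(bitsNeeded(128));      // 7: ASCII
console.log(bitsNeeded(256));      // 8: the single-byte character sets
console.log(bitsNeeded(110116));   // 17: code points in use
console.log(bitsNeeded(1114112));  // 21: everything Unicode could define
```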

However, computers have advanced since the 1970s. An 8 bit microprocessor is a bit out of date. New computers now have 64 bit processors, so why can’t we move beyond an 8 bit character and into a 32 bit or 64 bit character?

The first answer is: we can!

A lot of software is written in C or C++, which supports a “wide character”. This is a 16 or 32 bit character called wchar_t, depending on the platform. It is an extension of C’s 8 bit char type. A 32 bit character can theoretically represent over 4 billion distinct values, which is plenty for Unicode. Internally, modern Web browsers use these wide characters (or something similar) and can quite happily deal with every Unicode code point. So – internally, modern Web browsers use Unicode.

Trying It Yourself

The Javascript code below is similar to the ASCII code above, except it goes up to a much higher number. For each number, it tells the browser to display the corresponding Unicode code point:

<html>
<body>
<style type="text/css">p {float: left; padding: 0 15px; margin: 0; font-size: 80%;}</style>
<script type="text/javascript">
for (var i=0; i<2096; i++)
  document.writeln ((i%256?'':'<p>') + i + ': ' + String.fromCharCode (i) + '<br>');
</script>
</body>
</html>

It will output a table like this:

A selection of Unicode code points viewed in Firefox

The screenshot above only shows a subset of the first few thousand code points output by the Javascript. The selection includes some Cyrillic and Arabic characters, displayed right-to-left.

The important point here is that Javascript runs completely in the Web browser, where characters wider than 8 bits are perfectly acceptable (Javascript strings are made of 16 bit units). The Javascript function String.fromCharCode(1071) outputs the Unicode code point 1071 which is the letter Я.

Similarly if you put the HTML entity &#1071; into an HTML page, a modern Web browser would display Я. Numerical HTML entities also refer to Unicode.

On the other hand, the PHP function chr(1071) would output a forward slash / because the chr function only deals with 8 bit numbers up to 255 and repeats itself after that, and 1071%256=47 which has been a / since the 1960s.
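
The difference between the two functions can be imitated in Javascript alone. In this sketch phpChr is a hypothetical stand-in for PHP's chr, written on the assumption that only the lowest 8 bits of the number survive:

```javascript
// Hypothetical Javascript stand-in for PHP's chr: values wrap around every 256.
function phpChr(n) {
  return String.fromCharCode(((n % 256) + 256) % 256);
}

console.log(String.fromCharCode(1071)); // Я, the browser-side view
console.log(1071 % 256);                // 47
console.log(phpChr(1071));              // / (character 47)
```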

UTF-8 To The Rescue

So if browsers can deal with Unicode in 32 bit characters, where is the problem? The problem is in the sending and receiving, and reading and writing of characters.

The problem remains because:

  1. A lot of existing software and protocols send/receive and read/write 8 bit characters
  2. Using 32 bits to send/store English text would quadruple the amount of bandwidth/space required

Although browsers can deal with Unicode internally, you still have to get the data from the Web server to the Web browser and back again, and you need to save it in a file or database somewhere. So you still need a way to make 110,000 Unicode code points fit into just 8 bits.

There have been several attempts to solve this problem such as UCS-2 and UTF-16. But the winner in recent years is UTF-8, which stands for Universal Character Set Transformation Format 8 bit.

UTF-8 is clever. It works a bit like the Shift key on your keyboard. Normally when you press the H on your keyboard a lower case “h” appears on the screen. But if you press Shift first, a capital H will appear.

UTF-8 treats numbers 0-127 as ASCII, 192-247 as Shift keys, and 128-191 as the key to be shifted. For instance, characters 208 and 209 shift you into the Cyrillic range. 208 followed by 175 is character 1071, the Cyrillic Я. The exact calculation is (208%32)*64 + (175%64) = 1071. Characters 224-239 are like a double shift. 226 followed by 190 and then 128 is character 12160: ⾀. 240 and over is a triple shift.
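
The shift arithmetic above can be written out as a small decoder. This is a minimal sketch that trusts its input and does not check for malformed sequences; utf8ToCodePoints is a hypothetical name:

```javascript
// Decode an array of UTF-8 byte values into Unicode code points,
// using the same modulo arithmetic as the calculation above.
function utf8ToCodePoints(bytes) {
  var codePoints = [];
  for (var i = 0; i < bytes.length; i++) {
    var b = bytes[i];
    if (b < 128) {                 // plain ASCII
      codePoints.push(b);
    } else if (b < 224) {          // 192-223: shift, one more byte follows
      codePoints.push((b % 32) * 64 + (bytes[i + 1] % 64));
      i += 1;
    } else if (b < 240) {          // 224-239: double shift, two more bytes follow
      codePoints.push((b % 16) * 4096 + (bytes[i + 1] % 64) * 64 + (bytes[i + 2] % 64));
      i += 2;
    } else {                       // 240-247: triple shift, three more bytes follow
      codePoints.push((b % 8) * 262144 + (bytes[i + 1] % 64) * 4096 +
                      (bytes[i + 2] % 64) * 64 + (bytes[i + 3] % 64));
      i += 3;
    }
  }
  return codePoints;
}

console.log(utf8ToCodePoints([72, 208, 175, 226, 190, 128])); // [ 72, 1071, 12160 ]
```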

UTF-8 is therefore a multi-byte variable-width encoding. Multi-byte because a single character like Я takes more than one byte to specify it. Variable-width because some characters like H take only 1 byte and some up to 4.
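
Going the other way shows the variable width directly: the number of bytes depends on how large the code point is. This is a sketch only (it does not reject invalid code points), and codePointToUtf8 is a hypothetical name:

```javascript
// Encode one Unicode code point as an array of UTF-8 byte values.
function codePointToUtf8(cp) {
  if (cp < 128) return [cp];                                    // 1 byte: ASCII
  if (cp < 2048) return [192 + (cp >> 6), 128 + (cp & 63)];     // 2 bytes
  if (cp < 65536) {                                             // 3 bytes
    return [224 + (cp >> 12), 128 + ((cp >> 6) & 63), 128 + (cp & 63)];
  }
  return [240 + (cp >> 18), 128 + ((cp >> 12) & 63),            // 4 bytes
          128 + ((cp >> 6) & 63), 128 + (cp & 63)];
}

console.log(codePointToUtf8(72));     // [ 72 ]
console.log(codePointToUtf8(1071));   // [ 208, 175 ]
console.log(codePointToUtf8(12160));  // [ 226, 190, 128 ]
```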

Best of all it is backward compatible with ASCII. Unlike some of the other proposed solutions, any document written only in ASCII, using only characters 0-127, is perfectly valid UTF-8 as well – which saves bandwidth and hassle.

Trying It Yourself

This is a different experiment. PHP embeds the 6 numbers mentioned above into an HTML page: 72, 208, 175, 226, 190, 128. The browser interprets those numbers as UTF-8, and internally converts them into Unicode code points. Then Javascript outputs the Unicode values. Try changing the character set from UTF-8 to ISO-8859-1 and see what happens:

<html>
<head>
<meta charset="UTF-8">
</head>
<body>
<p>Characters embedded in the page:<br>
<span id="chars"><?php echo chr(72).chr(208).chr(175).chr(226).chr(190).chr(128); ?></span>
<p>Character values according to Javascript:<br>
<script type="text/javascript">
function ShowCharacters (s) {var r=''; for (var i=0; i<s.length; i++)
  r += s.charCodeAt (i) + ': ' + s.substr (i, 1) + '<br>'; return r;}
document.writeln (ShowCharacters (document.getElementById('chars').innerHTML));
</script>
</body>
</html>

If you are in a hurry, this is what it will look like:

The sequence of numbers above shown using the UTF-8 character set

Same sequence of numbers shown using the ISO-8859-1 character set

If you display the page using the UTF-8 character set, you will see only 3 characters: HЯ⾀. If you display it using the character set ISO-8859-1, you will see six separate characters: HÐ¯â¾€. This is what is happening:

  1. On your Web server, PHP is embedding the numbers 72, 208, 175, 226, 190 and 128 into a Web page
  2. The Web page whizzes across the Internet from the Web server to your Web browser
  3. The browser receives those numbers and interprets them according to the character set
  4. The browser internally represents the characters using their Unicode values
  5. Javascript outputs the corresponding Unicode values

Notice that when viewed as ISO-8859-1 the first 5 numbers are the same (72, 208, 175, 226, 190) as their Unicode code points. This is because Unicode borrowed heavily from ISO-8859-1 in that range. The last number however, the euro symbol €, is different. It is at position 128 in ISO-8859-1 as browsers interpret it (strictly speaking this is Windows-1252, which browsers use whenever a page asks for ISO-8859-1) and has the Unicode value 8364.

Summary Circa 2003

UTF-8 is becoming the most popular international character set on the Internet, superseding the older single-byte character sets like ISO-8859-5. When you view or send a non-English document, you still need to know what character set it uses. For widest interoperability, website administrators need to make sure all their web pages use the UTF-8 character set.

Perhaps the Ð looks familiar – it will sometimes show up if you try to view Russian UTF-8 documents. The next section describes how character sets get confused and end up storing things wrongly in a database.

Lots Of Problems

As long as everybody is speaking UTF-8, this should all work swimmingly. If they aren’t, then characters can get mangled. To explain why, imagine a typical interaction with a website, such as a user making a comment on a blog post:

  1. A Web page displays a comment form
  2. The user types a comment and submits
  3. The comment is sent back to the server and saved in a database
  4. The comment is later retrieved from the database and displayed on a Web page

This simple process can go wrong in lots of ways and produce the following types of problems:

HTML Entities

Pretend for a moment that you don’t know anything about character sets – erase the last 30 minutes from your memory. The form on your blog will probably display itself using the character set ISO-8859-1. This character set doesn’t know any Russian or Thai or Chinese, and only a little bit of Greek. If you attempt to copy and paste any into the form and press Submit, a modern browser will try to convert it into HTML numerical entities like &#1071; for Я.

That’s what will get saved in your database, and that’s what will be output when the comment is displayed – which means it will display fine on a Web page, but cause problems when you try to output it to a PDF or email, or run text searches for it in a database.
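
To get a feel for what ends up in the database, the conversion can be imitated in Javascript. This is only a rough sketch of the behaviour described above, assuming that every character above code point 255 is replaced; toNumericEntities is a hypothetical name:

```javascript
// Replace every character outside ISO-8859-1 (code point above 255)
// with a numerical HTML entity.
function toNumericEntities(text) {
  var out = '';
  for (var ch of text) {               // for...of steps through whole code points
    var cp = ch.codePointAt(0);
    out += cp > 255 ? '&#' + cp + ';' : ch;
  }
  return out;
}

console.log(toNumericEntities('Я £5'));  // &#1071; £5
```

The £ survives because ISO-8859-1 has a slot for it, while the Я is stored as 7 ASCII characters, which is why a database search for Я will not find it.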

Confused Characters

How about if you operate a Russian website, and you have not specified a character set in your Web page? Imagine a Russian user whose default character set is ISO-8859-5. To say “hi”, they might type Привет. When the user presses Submit, the characters are encoded according to the character set of the sending page. In this case, Привет is encoded as the numbers 191, 224, 216, 210, 213 and 226. Those numbers will get sent across the Internet to the server, and saved like that into a database.

If somebody later views that comment using ISO-8859-5, they will see the correct text. But if they view using a different Russian character set like Windows-1251, they will see їаШТХв. It’s still Russian, but makes no sense.

Accented Characters with Lots of Vowels

If someone views the same comment using ISO-8859-1, they will see ¿àØÒÕâ instead of Привет. A longer phrase like Я тоже рада Вас видеть (“nice to see you” in a formal way to a female), submitted as ISO-8859-5, will show up in ISO-8859-1 as Ï âÞÖÕ àÐÔÐ ²Ðá ÒØÔÕâì. It looks like that because the 128-255 range of ISO-8859-1 contains lots of vowels with accents.

So if you see this sort of pattern, it’s probably because text has been entered in a single byte character set (one of the ISO-8859s or Windows ones) and is being displayed as ISO-8859-1. To fix the text, you’ll need to figure out which character set it was entered as, and resubmit it as UTF-8 instead.
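
Once you know the original character set, the repair itself is mechanical. The sketch below assumes the text was entered as ISO-8859-5 and only handles its main Cyrillic range (bytes 176-239); repairFromIso88595 is a hypothetical name, and a real conversion would use a full table or a function such as PHP's iconv:

```javascript
// Text typed as ISO-8859-5 but displayed as ISO-8859-1.
// In ISO-8859-1 every character's code point equals its byte value,
// so the original bytes can be read back with charCodeAt.
function repairFromIso88595(garbled) {
  var fixed = '';
  for (var i = 0; i < garbled.length; i++) {
    var byte = garbled.charCodeAt(i);
    fixed += (byte >= 176 && byte <= 239)
      ? String.fromCharCode(byte + 0x360)   // move into the Unicode Cyrillic block
      : garbled.charAt(i);                  // leave everything else alone
  }
  return fixed;
}

console.log(repairFromIso88595('¿àØÒÕâ'));  // Привет
```

Because Javascript strings hold Unicode, the repaired text can then be saved or resubmitted as UTF-8.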

Alternating Accented Characters

What if the user submitted the comment in UTF-8? In that case the Cyrillic characters which make up the word Привет would each get sent as 2 numbers: 208/159, 209/128, 208/184, 208/178, 208/181 and 209/130. If you viewed that in ISO-8859-1 it would look like: ÐŸÑ€Ð¸Ð²ÐµÑ‚.

Notice that every other character is a Ð or Ñ. Those characters are numbers 208 and 209, and they tell UTF-8 to switch to the Cyrillic range. So if you see a lot of Ð and Ñ, you can assume that you are looking at Russian text entered in UTF-8, viewed as ISO-8859-1. Similarly, Greek will have lots of Î and Ï, 206 and 207. And Hebrew has alternating ×, number 215.
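
This pattern is easy to reproduce. The sketch below encodes text as UTF-8 and then shows each byte as its own character. It assumes the browser treats ISO-8859-1 as Windows-1252 for bytes 128-159; viewUtf8AsLatin1 is a hypothetical name:

```javascript
// Code points that Windows-1252 assigns to bytes 128-159.
// 0 marks an unassigned byte, which is passed through unchanged below.
var CP1252_HIGH = [
  0x20AC, 0, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
  0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0, 0x017D, 0,
  0, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
  0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0, 0x017E, 0x0178
];

// Show how UTF-8 text looks when every byte is displayed as a separate character.
function viewUtf8AsLatin1(text) {
  var bytes = new TextEncoder().encode(text);   // TextEncoder always produces UTF-8
  var out = '';
  for (var i = 0; i < bytes.length; i++) {
    var b = bytes[i];
    var cp = (b >= 128 && b <= 159) ? (CP1252_HIGH[b - 128] || b) : b;
    out += String.fromCharCode(cp);
  }
  return out;
}

console.log(viewUtf8AsLatin1('Привет'));  // ÐŸÑ€Ð¸Ð²ÐµÑ‚
```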

Vowels Before a Pound and Copyright Sign

A very common issue in the UK is the currency symbol £ getting converted into Â£. This is exactly the same issue as above with a coincidence thrown in to add confusion. The £ symbol has the Unicode and ISO-8859-1 value of 163. Recall that in UTF-8 any character over 127 is represented by a sequence of two or more numbers. In this case, the UTF-8 sequence is 194/163. Mathematically, this is because (194%32)*64 + (163%64) = 163.

Visually it means that if you view the UTF-8 sequence using ISO-8859-1, it appears to gain a Â which is character 194 in ISO-8859-1. The same thing happens for all Unicode code points 161-191, which includes © and ® and ¥.

So if your £ or © suddenly inherit a Â, it is because they were entered as UTF-8.
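
The coincidence can be checked for the whole range with a few lines of arithmetic. This is a small sketch; twoByteUtf8 is a hypothetical name and only covers code points that need exactly two bytes (128-2047):

```javascript
// Two-byte UTF-8 encoding for code points 128-2047.
function twoByteUtf8(cp) {
  return [192 + Math.floor(cp / 64), 128 + (cp % 64)];
}

// For code points 161-191 the first byte is always 194 (Â in ISO-8859-1)
// and the second byte is the code point itself.
for (var cp = 161; cp <= 191; cp++) {
  var pair = twoByteUtf8(cp);
  console.log(cp + ' ' + String.fromCharCode(cp) + ' is sent as ' + pair.join('/') +
              ' and viewed as ' + String.fromCharCode(pair[0], pair[1]));
}
```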

Black Diamond Question Marks

How about the other way around? If you enter Привет as ISO-8859-5, it will get saved as the numbers shown above: 191, 224, etc. If you then try to view this as UTF-8, you may well see lots of question marks inside black diamonds: �. The browser displays these when it can’t make sense of the numbers it is reading.

UTF-8 is self-synchronizing. Unlike other multi-byte character encodings, you always know where you are with UTF-8. If you see a number 192-247, you know you are at the beginning of a multi-byte sequence. If you see 128-191 you know you are in the middle of one. There’s no danger of missing the first number and garbling the rest of the text.
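
Here is a minimal sketch of what that means in practice, using the ranges given above. Landing anywhere in a stream of bytes, you can always step back to the start of the current character. The names byteKind and characterStart are hypothetical:

```javascript
// Classify a byte value using the ranges described above.
function byteKind(b) {
  if (b < 128) return 'ascii';
  if (b < 192) return 'continuation';   // 128-191: middle of a sequence
  return 'lead';                        // 192-247: beginning of a sequence
}

// From any position, step back to the first byte of the current character.
function characterStart(bytes, index) {
  while (index > 0 && byteKind(bytes[index]) === 'continuation') index--;
  return index;
}

var sample = [72, 208, 175, 226, 190, 128];   // H, Я and ⾀ in UTF-8
console.log(characterStart(sample, 5));       // 3: the 226 that starts ⾀
console.log(characterStart(sample, 2));       // 1: the 208 that starts Я
```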

This means that in UTF-8, the sequence 191 followed by 224 will never occur naturally, so the browser doesn’t know what to do with it and displays �� instead.

This can also cause £ and © related problems. £50 in ISO-8859-1 is the numbers 163, 53 and 48. The 53 and 48 cause no issues, but in UTF-8, 163 can never occur by itself, so this will show up as �50. Similarly if you see �2012, it is probably because ©2012 was input as ISO-8859-1 but is being displayed as UTF-8.
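
You can watch this happen with the standard TextDecoder object, which is available in modern browsers and in Node.js. This is a short sketch; viewAsUtf8 is a hypothetical wrapper name:

```javascript
// Decode byte values as UTF-8. Anything that is not valid UTF-8
// becomes the replacement character U+FFFD (the black diamond).
function viewAsUtf8(bytes) {
  return new TextDecoder('utf-8').decode(new Uint8Array(bytes));
}

console.log(viewAsUtf8([163, 53, 48]));          // £50 saved as ISO-8859-1
console.log(viewAsUtf8([169, 50, 48, 49, 50]));  // ©2012 saved as ISO-8859-1
console.log(viewAsUtf8([194, 163, 53, 48]));     // £50 saved as UTF-8 displays correctly
```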

Blanks, Question Marks and Boxes

Even if it is fully up to speed with UTF-8 and Unicode, a browser still may not know how to display a character. The first few ASCII characters 1-31 are mostly control sequences for teleprinters (things like Acknowledge and Stop). If you try to display them, a browser might show a ? or a blank or a box with tiny numbers inside it.

Also, Unicode defines over 110,000 characters. Your browser may not have the correct font to display all of them. Some of the more obscure characters may also get shown as ? or blank or a small box. In older browsers, even fairly common non-English characters may show as boxes.

Older browsers may also behave differently for some of the issues above, showing ? and blank boxes more often.

Databases

The discussion above has avoided the middle step in the process – saving data to a database. Databases like MySQL can also specify a character set for a database, table or column. But it is less important than the Web page’s character set.

When saving and retrieving data, MySQL deals just with numbers. If you tell it to save number 163, it will. If you give it 208/159 it will save those two numbers. And when you retrieve the data, you’ll get the same two numbers back.

The character set becomes more important when you use database functions to compare, convert and measure the data. For example, the LENGTH of a field may depend on its character set, as do string comparisons using LIKE and =. The method used to compare strings is called a collation.
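
The length issue is easy to see outside the database too. In MySQL, LENGTH counts bytes while CHAR_LENGTH counts characters, and the Javascript sketch below makes the same distinction for text stored as UTF-8 (lengths is a hypothetical helper name):

```javascript
// Count a string's characters and the bytes it occupies when stored as UTF-8.
function lengths(text) {
  return {
    characters: Array.from(text).length,               // whole code points
    utf8Bytes: new TextEncoder().encode(text).length   // bytes after UTF-8 encoding
  };
}

console.log(lengths('Hello'));   // { characters: 5, utf8Bytes: 5 }
console.log(lengths('Привет'));  // { characters: 6, utf8Bytes: 12 }
```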

Character sets and collations in MySQL are an in-depth subject. It’s not simply a case of changing the character set of a table to UTF-8. There are further SQL commands to take into account to make sure the data goes in and out in the right format as well. This blog is a good starting point.

Trying It Yourself

The following PHP and Javascript code allows you to experiment with all these issues. You can specify which character set is used to input and output text, and you can see what the browser thinks about it too.

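<?php
// For local experiments only: submitted values are echoed back without escaping.
// Read the submitted values, falling back to defaults on the first visit.
$charset = !empty($_POST['charset']) ? $_POST['charset'] : 'ISO-8859-1';
$string = isset($_POST['string']) ? $_POST['string'] : '';
if ($string) {
        echo '<p>This is what PHP thinks you entered:<br>';
        // strlen, substr and ord all work on single bytes, so each line shows one byte value
        for ($i=0; $i<strlen($string); $i++) {$c=substr ($string,$i,1); echo ord ($c).': '.$c.' <br/>';}
}
?>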
<html>
<head>
<meta charset="<?=$charset?>">
</head>
<body>
<form method="post">
<input name="lastcharset" type="hidden" value="<?php echo $charset?>"/>
Form was submitted as: <?php echo isset($_POST['lastcharset']) ? $_POST['lastcharset'] : ''?><br/>
Text is displayed as: <?php echo $charset?><br/>
Text will be submitted as: <?php echo $charset?><br/>
Copy and paste or type here:
<input name="string" type="text" size="20" value="<?php echo $string?>"/><br/>
Next page will display as:
<select name="charset"><option>ISO-8859-1<option>ISO-8859-5
<option>Windows-1251<option>ISO-8859-7<option>UTF-8</select><br/>
<input type="submit" value="Submit" onclick="ShowCharacters (this.form.string.value); return 1;"/>
</form>
<script type="text/javascript">
function ShowCharacters (s) {
  var r='You entered:';
  for (var i=0; i<s.length; i++) r += '\n' + s.charCodeAt (i) + ': ' + s.substr (i, 1);
  alert (r);
}
</script>
</body>
</html>

This is an example of the code in action. The numbers at the top are the numerical values of each of the characters and their representation (when viewed individually) in the current character set:

Example of inputting and output in different character sets. This shows a £ sign turning into a � in Google Chrome.

The page above shows the previous, current and future character sets. You can use this code to quickly see how text can get really mangled. For example, if you pressed Submit again above, the � has Unicode code point 65533 which is 239/191/189 in UTF-8 and will be displayed as ï¿½50 in ISO-8859-1. So if you ever get £ symbols turning into ï¿½, that is probably the route they took.
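
That last step can be confirmed in a couple of lines, using the standard TextEncoder object (which always produces UTF-8):

```javascript
// UTF-8 bytes of the replacement character U+FFFD (65533).
var fffdBytes = Array.from(new TextEncoder().encode('\uFFFD'));
console.log(fffdBytes.join('/'));   // 239/191/189

// The same three bytes shown one character per byte, as in ISO-8859-1.
var fffdSeen = String.fromCharCode.apply(null, fffdBytes);
console.log(fffdSeen + '50');       // ï¿½50
```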

Note that the select box at the bottom will change back to ISO-8859-1 each time.

One Solution

All the encoding problems above are caused by text being submitted in one character set and viewed in another. The solution is to make sure that every page on your website uses UTF-8. You can do this with one of these lines immediately after the <head> tag:

<meta charset="UTF-8">
<meta http-equiv="Content-type" content="text/html; charset=UTF-8">

It has to be one of the first things in your Web page, as it will cause the browser to look again at the page in a whole new light. For speed and efficiency, it should do this as soon as possible.

You can also specify UTF-8 in your MySQL tables, though to fully use this feature, you’ll need to delve deeper.

Note that users can still override the character set in their browsers. This is rare, but does mean that this solution is not guaranteed to work. For extra safety, you could implement a back-end check to ensure data is arriving in the correct format.
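
Such a check could look something like the sketch below. It is written in Javascript (for example for a Node.js back end) and assumes the submitted data is available as an array of byte values; isValidUtf8 is a hypothetical name. In PHP, mb_check_encoding serves a similar purpose.

```javascript
// Returns true if the byte values form valid UTF-8, false otherwise.
// A TextDecoder created with fatal: true throws an error on invalid input
// instead of quietly inserting replacement characters.
function isValidUtf8(bytes) {
  try {
    new TextDecoder('utf-8', { fatal: true }).decode(new Uint8Array(bytes));
    return true;
  } catch (e) {
    return false;
  }
}

console.log(isValidUtf8([208, 159, 209, 128]));  // true: the start of Привет in UTF-8
console.log(isValidUtf8([191, 224, 216]));       // false: the start of Привет in ISO-8859-5
```

Passing the check does not prove the text was meant as UTF-8 (plain ASCII passes too), but failing it is a reliable sign that something arrived in another character set.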

Existing Websites

If your website has already been collecting text in a variety of languages, then you will also need to convert your existing data into UTF-8. If there is not much of it, you can use a PHP page like the one above to figure out the original character set, and use the browser to convert the data into UTF-8.

If you have lots of data in various character sets, you’ll need to first detect the character set and then convert it. In PHP you can use mb_detect_encoding to detect and iconv to convert. Reading the comments for mb_detect_encoding, it looks like quite a fussy function, so be sure to experiment to make sure you are using it properly and getting the right results.

A potentially misleading function is utf8_decode. It turns UTF-8 into ISO-8859-1. Any characters not available in ISO-8859-1 (like Cyrillic, Greek, Thai, etc) are turned into question marks. It’s misleading because you might have expected more from it, but it does the best it can.
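
The behaviour described above can be imitated in Javascript to see what gets lost. This is only an illustration of the idea, not PHP's implementation, and toLatin1Bytes is a hypothetical name:

```javascript
// Code points up to 255 become one ISO-8859-1 byte each.
// Everything else becomes a question mark (byte 63).
function toLatin1Bytes(text) {
  var bytes = [];
  for (var ch of text) {
    var cp = ch.codePointAt(0);
    bytes.push(cp <= 255 ? cp : 63);
  }
  return bytes;
}

console.log(toLatin1Bytes('£5 Я'));  // [ 163, 53, 32, 63 ]
```

The £ comes through as 163, but the Я is gone for good: once it has been turned into a question mark there is no way to get it back.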

Summary

This article has relied heavily on numbers and has tried to leave no stone unturned. Hopefully it has provided an exhaustive understanding of character sets, Unicode, UTF-8 and the various problems that can arise. The morals of the story are:

  • You need to know the character set in order to make sense of non-Latin text
  • Internally, browsers use Unicode to represent characters
  • Make sure all your Web pages specify the UTF-8 character set

For a slightly different approach to this subject, this 2003 character set article is excellent. Thank you for sticking with this epic journey.

(il)

Image credits (front page): nevsred.


© Paul Tero for Smashing Magazine, 2012.


Tales From the Darkside: Dark and Surreal Photo Manipulations


  

Darker, surreal themes can often lead to some really interesting photo manipulations, as stylistically artists can stray into genres such as gothic, horror, the macabre and violence. Not the most pleasant themes for sure, but they can result in some very emotive, engaging pieces of art.

Photo manipulation is the ideal technique to work with dark, surreal subject matters, as everyday images can be warped and changed into something grotesque or unusual. For this type of composition regular scenes are also often manipulated to look darker or more menacing.

Common themes and images in dark and surreal photo manipulations include gothic feminine beauties, mist/fog, death/loss, nighttime, psychological stress/angst, falling, scenes that defy the laws of physics and personal contemplation/reflection.

As you can see from these themes and motifs, dark and surreal photo manipulations are often about more than just looking attractive. They cover a wide range of serious matters, emotions and themes, and it’s up to the artist to capture them elegantly.

Today we present 40 stunning examples of dark and surreal photo manipulation art. We hope that they inspire you to try your own designs and venture into this genre.

Dark and Surreal Photo Manipulations

Fire Within Me by freaky665

Chronoscape – thundersnow by alexiuss

Noumeno by Blekotakra

Vampire Queen by Pygar

Acrimony by alexiuss

Hidden Intentions by freaky665

Symphony of Destruction by lady-symphonia

Mysteria by red-riding

Last Exit by OmeN2501

The Gathering by immanuel

I get what I deserve by freaky665

60568 by kubicki

The Laws of the Future by lady-symphonia

When a Part of Me Dies by lady-symphonia

Nocturna by lady-symphonia

The Awakening by freaky665

Dracula’s Bride Modern Edition by mary-petroff

Compagnia by Blekotakra

The Three Sisters by lady-symphonia

What shall we die for? by Azdup

If death is the end… by freaky665

Cradle of Filth art002 by NatalieShau

Night Butterfly by Wishmistress

Isolation by Wishmistress

A r a c h n i A by J-u-d-a-s

Tears of stone III: In memoriam by aphostol

Confessions by lady-symphonia

Egg Island by djajakarta

E x i l e by aphostol

Neil Gaiman’s Death III by hoschie

Porcelain Splinter by conzpiracy

Hecate by mari-na

Meme La Nuit Est Seule by valse-des-ombres

Harmony… by J-u-d-a-s

Nightshift by J-u-d-a-s

Corrupt by Taborda08

Intersection by robcherry

Artefact by Guivre1580

My Metallic Sonata by Wishmistress

What Do You Think?

I hope that you enjoyed this article. Did you have any favorite photo manipulations from this collection? Perhaps a favorite theme that jumped out at you within this genre? Let us know in the comments below and we can get a discussion going!

(rb)


  • Copyright © 1996-2010 BlogmyQuery - BMQ. All rights reserved.