Why is it important for companies to focus on performance?

Many people, particularly developers, find it hard to justify the cost of improving the performance of their applications. We tried to address that briefly in an earlier blog post.

But people still find it difficult to justify the ROI of such an effort. So, why is it important for companies to focus on performance?

Steve Souders, writing on the O'Reilly blog, says real-world numbers show the benefits of making your site faster!

Some quick facts:
  • Bing found that a 2-second slowdown caused a 4.3% reduction in revenue per user
  • Google Search found that a 400-millisecond delay resulted in 0.59% fewer searches per user
  • AOL revealed that users who experience the fastest page load times view 50% more pages per visit than users who experience the slowest page load times
  • Shopzilla undertook a massive performance redesign, reducing page load times from ~7 seconds to ~2 seconds, with a corresponding 7-12% increase in revenue and a 50% reduction in hardware costs
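
To turn figures like these into a business case, it can help to run them against your own numbers. The sketch below is only illustrative: the 1,000,000 users and $2.00 revenue per user are hypothetical values, and `revenue_impact` is a helper name made up for this example. Only the 4.3% figure comes from the Bing result quoted above.

```python
def revenue_impact(users, revenue_per_user, reduction_pct):
    """Return the revenue lost when revenue per user drops by reduction_pct percent."""
    baseline = users * revenue_per_user
    return baseline * reduction_pct / 100.0

# Hypothetical site: 1,000,000 users at $2.00 revenue per user (assumed numbers).
# 4.3 is the revenue/user reduction Bing reported for a 2-second slowdown.
lost = revenue_impact(users=1_000_000, revenue_per_user=2.00, reduction_pct=4.3)
print(f"Estimated revenue lost: ${lost:,.2f}")
```

Swapping in your own traffic and revenue figures gives a rough upper bound to compare against the cost of the performance work.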

Most engineers are excited about the prospect of improving the performance of their websites, but to do that they need to justify the effort to the non-engineering parts of the organization. We hope these numbers and links help you make your case that much stronger.


Column Oriented Databases

Continuing our coverage of the disadvantages of relational databases: traditional RDBMS design is not read-optimized enough for high-performance or OLAP-style applications, where aggregates are computed over large numbers of similar data items.

This is where a column-oriented DBMS can help.

The biggest bottleneck for a database query in a reporting scenario is disk read time. A column-oriented DBMS attacks this problem by drastically reducing disk reads in most such scenarios. Allow me to explain.

Traditionally, relational database design has been based on rows. We developers are so used to this that we can visualize it without effort. Employee records in a typical row-based database look like this:

101 Aravind 27
102 Mike 25

This table will be stored on disk as 101;Aravind;27&&102;Mike;25. A column-oriented implementation of the same table would be persisted as 101;102&&Aravind;Mike&&27;25.
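
The two layouts can be reproduced with a few lines of Python. This is only a sketch: the `;` and `&&` separators are the illustrative notation used in this post, not the on-disk format of any real database, and the function names are made up for the example.

```python
# Employee table from the post: (id, name, age).
employees = [
    (101, "Aravind", 27),
    (102, "Mike", 25),
]

def serialize_row_oriented(records):
    # Keep each record's fields together; records are separated by "&&".
    return "&&".join(";".join(str(field) for field in record) for record in records)

def serialize_column_oriented(records):
    # Transpose the table so that each column's values sit together.
    columns = zip(*records)
    return "&&".join(";".join(str(value) for value in column) for column in columns)

print(serialize_row_oriented(employees))     # 101;Aravind;27&&102;Mike;25
print(serialize_column_oriented(employees))  # 101;102&&Aravind;Mike&&27;25
```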

When the query is to find the average age from the table, far fewer disk reads are needed to get the ages of all the employees from a column-oriented implementation, because all the ages are stored almost sequentially.

While the row-based RDBMS favors queries that fetch all the data of a given row, the column-oriented implementation favors queries that compute aggregates over a specific column. One example is a count of all users under the age of 30.
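
Here is a minimal in-memory sketch of why aggregates favor the column layout. It assumes that the number of values touched is a reasonable stand-in for disk reads. Real engines read whole pages and apply compression, so treat the counts as illustrative only. All function names are hypothetical.

```python
rows = [(101, "Aravind", 27), (102, "Mike", 25)]
columns = {"id": [101, 102], "name": ["Aravind", "Mike"], "age": [27, 25]}

def average_age_row_store(rows):
    """Average age from a row layout; returns (average, values_read)."""
    values_read = 0
    total = 0
    for record in rows:
        values_read += len(record)  # the whole record is read to reach the age
        total += record[2]
    return total / len(rows), values_read

def average_age_column_store(columns):
    """Average age from a column layout; returns (average, values_read)."""
    ages = columns["age"]  # only the age column is read
    return sum(ages) / len(ages), len(ages)

def count_younger_than(columns, limit):
    """Count users below an age limit using only the age column."""
    return sum(1 for age in columns["age"] if age < limit)

print(average_age_row_store(rows))
print(average_age_column_store(columns))
print(count_younger_than(columns, 30))
```

Both functions return the same average, but the row version touches every field of every record while the column version touches only the ages.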

The Wikipedia page has a good list of column-oriented DB implementations.


Microsoft Photosynth - ecBook on Steroids?

Developed as a research project within Live Labs — Microsoft's applied research arm — Photosynth automatically stitches together digital photographs to create a somewhat abstract but high-resolution three-dimensional re-creation (called a synth) for the world to explore.

"Photosynth creates an amazing new experience with nothing more than a bunch of photos. Creating a synth allows you to share the places and things you love using the cinematic quality of a movie, the control of a video game, and the mind-blowing detail of the real world."

It's very hard to explain what Photosynth really is; you have to see it for yourself.

The main selling point of this project is that it works today: it takes the pictures you shoot with your existing camera and creates a 3D view that you can use to explore the subject further.

Scenarios like exploring a historical monument or sculpture, investigating a crime scene, taking a virtual walk around the place where you spent your honeymoon, or even taking a tour around a friend you miss become instantly possible!

Straight out of science fiction? No!

Where does all this leave ecBook? Well, ecBook solves a different set of user and enterprise problems altogether, like moving your offline print catalog online and making it dynamic. But ecBook could learn a thing or two about visualization from Photosynth and bring that to the enterprise world.


Learn how to create High Performance Websites

This blog, "High Performance Websites", is about creating really fast websites. The author, Steve Souders, is an acclaimed expert in the field of performance. He is a web performance evangelist at Google and the former Chief Performance Yahoo!

Steve is the creator of YSlow, the performance analysis extension to Firebug. He is also co-chair of Velocity 2008, the first web performance conference sponsored by O'Reilly. He frequently speaks at such conferences as OSCON, Rich Web Experience, Web 2.0 Expo, and The Ajax Experience.

Steve previously worked at Yahoo! as the Chief Performance Yahoo!, where he blogged about web performance on Yahoo! Developer Network. He was named a Yahoo! Superstar. Steve worked on many of the platforms and products within the company, including running the development team for My Yahoo!

The blog posts cover strategies like 'Sharding Dominant Domains' that look at web pages from the browser's perspective and lucidly explain how each strategy affects load time.
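
As we understand it, domain sharding means serving static assets from more than one hostname so that the browser, which limits parallel connections per hostname, can download more of them at once. The sketch below is our own illustration, not code from Steve's post: the `static1.example.com` and `static2.example.com` hostnames and the `shard_url` helper are hypothetical.

```python
import zlib

# Hypothetical shard hostnames; a real site would use its own domains.
SHARD_HOSTS = ["static1.example.com", "static2.example.com"]

def shard_url(asset_path, hosts=SHARD_HOSTS):
    """Map an asset path to one shard host, always the same one for a given path."""
    # crc32 is stable across runs, unlike Python's built-in hash() for strings.
    index = zlib.crc32(asset_path.encode("utf-8")) % len(hosts)
    return f"http://{hosts[index]}{asset_path}"

for path in ["/img/logo.png", "/img/banner.png", "/css/site.css"]:
    print(shard_url(path))
```

Keeping the mapping deterministic matters: if the same image moved between hostnames from one page view to the next, the browser cache would be defeated.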

Other blog posts like 'Using Iframes Sparingly' and 'Flushing the Document Early' give practical tips for optimizing website performance. These are not the usual strategies you read about elsewhere, such as using smaller or fewer images or HTML/CSS compression.
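
To give a flavor of 'Flushing the Document Early': the idea, as we read it, is to send the top of the HTML document before the slow back-end work finishes, so the browser can start fetching stylesheets and scripts in the meantime. Here is a minimal sketch using a Python generator and a WSGI application. The `/site.css` path and all function names are assumptions for the example, and whether each chunk really reaches the browser right away depends on the server and on any buffering proxies in between.

```python
import time

def render_page(build_body):
    """Yield the page in chunks so the <head> can be sent before the body is ready."""
    # First chunk: available right away, so the browser can start fetching the stylesheet.
    yield '<html><head><link rel="stylesheet" href="/site.css"></head>'
    # Second chunk: produced only after the (possibly slow) back-end work finishes.
    yield f"<body>{build_body()}</body></html>"

def slow_body():
    time.sleep(0.1)  # stand-in for database queries or other back-end work
    return "<h1>Hello</h1>"

def app(environ, start_response):
    """Minimal WSGI application that streams the chunks from render_page."""
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return (chunk.encode("utf-8") for chunk in render_page(slow_body))
```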

You might also want to check out his books, High Performance Web Sites and Even Faster Web Sites, both bestsellers. His lectures at Stanford University, a wealth of knowledge, are also available here at a cost (a few videos are free).

Learn from the master himself! We will introduce more blogs in the weeks to follow, so check back with us.


Google announced its new OS "Google Chrome OS"

Happy news for all Google lovers: Google is releasing Google Chrome OS in the second half of 2010.

"Google Chrome OS is an attempt to re-think what operating systems should be."

Google Chrome OS is an open source, lightweight operating system that will initially be targeted at netbooks. Google will open-source its code later this year, and netbooks running Google Chrome OS will be available to consumers in the second half of 2010. Because Google is already talking to partners about the project and will soon be working with the open source community, it wanted to share its vision now so that everyone understands what it is trying to achieve.

Google Chrome OS is a new project, separate from Android. Chrome OS is being created for people "who spend most of their time on the web" according to Sundar Pichai, VP Product Management at Google.

Google Chrome OS will run on both x86 and ARM chips, and Google is working with multiple OEMs to bring a number of netbooks to market next year. The software architecture is simple: Google Chrome running within a new windowing system on top of a Linux kernel. For application developers, the web is the platform. All web-based applications will automatically work, and new applications can be written using your favorite web technologies. And of course, these apps will run not only on Google Chrome OS but on any standards-based browser on Windows, Mac and Linux, thereby giving developers the largest user base of any platform.

Eagerly waiting for Google Chrome OS!


What's wrong with RDBMS & its ACID properties?

My last blog post asked whether relational databases are dying. Most database management systems on the market promise ACID properties out of the box. ACID stands for:
  • Atomic: A transaction succeeds or fails as a single unit.
  • Consistent: A transaction cannot leave the database in an inconsistent state.
  • Isolated: Transactions cannot interfere with each other.
  • Durable: Completed transactions persist, even when servers restart etc.
At the outset, these properties seem indisputable, and all databases seem to have them; at least that is what we have all believed for so long!
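
To make the 'Atomic' property concrete, here is a small sketch using Python's built-in sqlite3 module and an in-memory database. The `accounts` table, the names and the `transfer` helper are all made up for illustration. The point is only that a failed transfer leaves no half-applied update behind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(conn, sender, receiver, amount):
    """Move money between accounts; either both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, sender))
            balance = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (sender,)).fetchone()[0]
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, receiver))
        return True
    except ValueError:
        return False

def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))

print(transfer(conn, "alice", "bob", 30), balances(conn))
print(transfer(conn, "alice", "bob", 500), balances(conn))
```

In the second call the debit is applied first and then rolled back, so the balances are the same before and after the failed transfer.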

But today's highly scalable internet applications have a different set of demands that are at odds with these ACID properties. For example:
  • Distributing a database across more than one machine becomes inevitable as your website gets more and more traffic. At that point it becomes very hard to get an acceptable response time while keeping these guarantees, given problems like network delays and failures.
  • Scaling up one database node endlessly is an impossible task, because hardware failure can never be ruled out (at least for now). Downtime is simply unacceptable.
  • Caching mechanisms conflict with the 'Consistent' property, because the database can be updated before the corresponding cache update has completed.
This conflict is explained in a simple way by Brewer's CAP theorem, which says that in a distributed system (which most internet-scale applications are), if you want consistency, availability, and partition tolerance, you have to settle for two out of these three properties. Find a more detailed discussion of the CAP theorem here.

The CAP theorem implies that systems insisting on strict ACID guarantees will run into problems as scalable distributed systems. Thankfully, we have an alternative called BASE!
BASE is an acronym for:
  • Basically Available
  • Soft-state
  • Eventual consistency
Rather than requiring consistency after every transaction, it is enough for the database to eventually be in a consistent state. (Accounting systems do this all the time. It’s called “closing out the books.”) It’s OK to use stale data, and it’s OK to give approximate answers.
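
Eventual consistency, and the stale reads that come with it, can be shown with a toy model. This is a deliberately simplified sketch with one primary and one replica held in memory; the class and method names are hypothetical, and real systems propagate updates in the background rather than through an explicit `sync()` call.

```python
class Replica:
    """One node's copy of the data (a plain in-memory dict here)."""
    def __init__(self):
        self.data = {}

class EventuallyConsistentStore:
    """Writes go to the primary at once and reach the replica only on sync()."""
    def __init__(self):
        self.primary = Replica()
        self.replica = Replica()
        self.pending = []  # writes not yet applied to the replica

    def write(self, key, value):
        self.primary.data[key] = value
        self.pending.append((key, value))

    def read_from_replica(self, key):
        # May return stale data (or None) until sync() has run.
        return self.replica.data.get(key)

    def sync(self):
        # Apply pending writes in order; afterwards both copies agree.
        for key, value in self.pending:
            self.replica.data[key] = value
        self.pending.clear()

store = EventuallyConsistentStore()
store.write("balance", 100)
store.sync()
store.write("balance", 80)
stale = store.read_from_replica("balance")  # replica has not seen the second write
store.sync()
fresh = store.read_from_replica("balance")
print(stale, fresh)
```

Between the second write and the second `sync()`, a reader of the replica sees the old value. That window is the price paid for not coordinating every write across nodes.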

It’s harder to develop software in the fault-tolerant BASE world compared to the fastidious ACID world, but Brewer’s CAP theorem says you have no choice if you want to scale up. However, as Brewer points out in this presentation, there is a continuum between ACID and BASE. You can decide how close you want to be to one end of the continuum or the other according to your priorities.

Source: http://www.johndcook.com/blog/2009/07/06/brewer-cap-theorem-base/


Is RDBMS, as we know it, dead?

Could the RDBMS, which has served the software development world so well for so long, soon be killed off?

At least a bunch of like-minded nerds who met in San Francisco seem to think so! A debrief of the conference, along with slide decks and recordings, can be had here.

Their argument is that "Relational databases give you too much. They force you to twist your object data to fit a RDBMS". NoSQL-based alternatives "just give you what you need".
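
As a rough illustration of the 'twisting' the quote refers to, the sketch below stores the same object two ways: split across two relational tables (using Python's built-in sqlite3) and as a single JSON value under one key. The `user` object, table names, key format and helper functions are all hypothetical, and the plain dict is only a stand-in for a real key-value store.

```python
import json
import sqlite3

# A hypothetical application object with a nested list.
user = {"id": 1, "name": "Aravind", "tags": ["admin", "beta"]}

# Relational approach: the object is split across two tables.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("CREATE TABLE user_tags (user_id INTEGER, tag TEXT)")
db.execute("INSERT INTO users VALUES (?, ?)", (user["id"], user["name"]))
db.executemany("INSERT INTO user_tags VALUES (?, ?)",
               [(user["id"], tag) for tag in user["tags"]])

def load_user_relational(db, user_id):
    # Reassemble the object from both tables.
    uid, name = db.execute(
        "SELECT id, name FROM users WHERE id = ?", (user_id,)).fetchone()
    tags = [row[0] for row in db.execute(
        "SELECT tag FROM user_tags WHERE user_id = ? ORDER BY rowid", (user_id,))]
    return {"id": uid, "name": name, "tags": tags}

# Key-value approach: the whole object is stored under one key.
kv_store = {}  # stand-in for a real key-value store
kv_store[f"user:{user['id']}"] = json.dumps(user)

def load_user_kv(store, user_id):
    return json.loads(store[f"user:{user_id}"])

print(load_user_relational(db, 1))
print(load_user_kv(kv_store, 1))
```

The relational version pays for its mapping with flexibility: you can ask "which users have the beta tag?" with one query, which the key-value version cannot answer without scanning every value.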

By sidestepping the time-consuming toil of translating apps and data into a SQL-friendly format, NoSQL architectures perform much faster, they say.

"SQL is an awkward fit for procedural code, and almost all code is procedural." For data upon which users expect to do heavy, repeated manipulations, the cost of mapping data into SQL is "well worth paying ... But when your database structure is very, very simple, SQL may not seem that beneficial."

Facebook, for instance, created its Cassandra data store to power a new search feature on its Web site rather than use its existing database, MySQL. According to a presentation by Facebook engineer Avinash Lakshman (PDF document), Cassandra can write to a data store taking up 50GB on disk in just 0.12 milliseconds, more than 2,500 times faster than MySQL.

Amazon.com's CTO, Werner Vogels, refers to the company's influential Dynamo system as a "highly available key-value store." Google calls its BigTable, the other role model for many NoSQL adherents, a "distributed storage system for managing structured data."

Hypertable, an open-source column-based database modeled upon BigTable, is used by local search engine Zvents Inc. to write 1 billion cells of data per day, according to a presentation by Doug Judd (PDF document), a Zvents engineer.

Encouraging? Think again...

"Most large enterprises have an established way of doing OLTP [online transaction processing], probably via relational database management systems. Why change?" MapReduce and similar BI-oriented projects "may be useful for enterprises. But where it is, it probably should be integrated into an analytic DBMS [database management system]."

Because they are open source, NoSQL alternatives lack vendors offering formal support. That's definitely no deal breaker for many of us!

Source: ComputerWorld
