Mobile - Web - Media
Monday, Feb 19, 2007 2:09:18 PM
Amazon's Web Services Offer Upscaling and Redundancy
Over the last month, I've turned my development focus away from new features, and have targeted performance and scaling issues as my number one priority.
I've been rewriting core parts of the application that generates ArtistServer, simplifying and optimizing it through object-oriented design methods, and shaving 10-1,500ms from pages on the site. While that may seem minimal, when you multiply it by thousands of users making requests, those milliseconds add up!
Scaling means being able to support a growing population of users, data, and media while sustaining acceptable performance for those users. Doing this usually requires some initial investment in hardware, so you stay ahead of the curve. Thanks to Amazon Web Services, you and I can now ride the growth curve in real time.
Amazon offers several Web Services which could greatly change your business by saving you money, allowing you to grow on less initial investment, and potentially opening your business up to new opportunities. In my case, I'm beginning to see how I'll be able to continue growing my business, minimize the risks, improve quality, and save money while freeing my time up for creating and exploring new ideas.
Amazon Web Services Home
Are you aware of these services?
- Elastic Compute Cloud (Amazon EC2) - is a web service that provides resizable compute capacity in the cloud. It is designed to make web-scale computing easier for developers.
Just as Amazon Simple Storage Service (Amazon S3) enables storage in the cloud, Amazon EC2 enables "compute" in the cloud. Amazon EC2's simple web service interface allows you to obtain and configure capacity with minimal friction. It provides you with complete control of your computing resources and lets you run on Amazon's proven computing environment. Amazon EC2 reduces the time required to obtain and boot new server instances to minutes, allowing you to quickly scale capacity, both up and down, as your computing requirements change. Amazon EC2 changes the economics of computing by allowing you to pay only for capacity that you actually use.
- Mechanical Turk - provides a web services API for computers to integrate Artificial Artificial Intelligence directly into their processing by making requests of humans. Developers use the Amazon Mechanical Turk web services API to submit tasks to the Amazon Mechanical Turk web site, approve completed tasks, and incorporate the answers into their software applications. To the application, the transaction looks very much like any remote procedure call - the application sends the request, and the service returns the results. In reality, a network of humans fuels this Artificial Artificial Intelligence by coming to the web site, searching for and completing tasks, and receiving payment for their work.
- Simple Storage Service (S3) - S3 is storage for the Internet. It is designed to make web-scale computing easier for developers.
Amazon S3 provides a simple web services interface that can be used to store and retrieve any amount of data, at any time, from anywhere on the web. It gives any developer access to the same highly scalable, reliable, fast, inexpensive data storage infrastructure that Amazon uses to run its own global network of web sites. The service aims to maximize benefits of scale and to pass those benefits on to developers.
- Simple Queue Service (Amazon SQS) - offers a reliable, highly scalable hosted queue for storing messages as they travel between computers. By using Amazon SQS, developers can simply move data between distributed application components performing different tasks, without losing messages or requiring each component to be always available.
Amazon SQS works by exposing Amazon's web-scale messaging infrastructure as a web service. Any computer on the Internet can add or read messages without any installed software or special firewall configurations. Components of applications using Amazon SQS can run independently, and do not need to be on the same network, developed with the same technologies, or running at the same time.
I suggest spending some time at the Amazon Web Services site reading up on each of their services, then brainstorming on how you could use these services in your business or site.
I'm currently working on an integration with Amazon's S3 service, and once that's complete, I'll most likely look at integrating EC2 and SQS into my application. Below is a diagram showing how a developer could add Amazon S3 (Simple Storage Service) to their Web application and gain not only a storage solution that scales, but also data redundancy, allowing them to toggle between local and remote versions of files. The diagram is meant to give you a general idea of how you could integrate S3; there's definitely room for variations and additional optimization. For example, instead of using a local page for your queue, you could use Amazon's SQS, or your own queueing server with a 'ticket system' for managing tasks.
The diagram has two sections: the first shows files being uploaded to your site/service, and the second shows the serving of those files. The main idea is that files are stored on your server, then copied out to your S3 account. By integrating a variable for tracking the status of the copy out to S3, you can serve either the local or remote copy. Since S3 can scale more than your server, you would use the S3 copy as the primary version, and your local copy as the backup. If for some reason S3 goes offline, you can use an application-level variable to force your application to use the local copies. The 'code' you see in the diagram is 'pseudo code' and is only for demonstration purposes.
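To make the local/remote toggle concrete, here is a minimal Python sketch of the serving decision. It is not the code from the diagram: the base URLs, the `StoredFile` record, and the `use_s3` switch are all made-up names for illustration, and no real S3 calls are made.

```python
from dataclasses import dataclass

# Hypothetical base URLs, for illustration only.
LOCAL_BASE = "http://media.example.com/files/"
S3_BASE = "http://s3.amazonaws.com/example-bucket/"


@dataclass
class StoredFile:
    name: str
    copied_to_s3: bool = False  # status flag, set once the copy-out succeeds


def file_url(f: StoredFile, use_s3: bool) -> str:
    """Serve the S3 copy when it exists and S3 is enabled, else the local copy.

    `use_s3` plays the role of the application-level variable: pass False
    to force every file back to the local copy if S3 goes offline.
    """
    if use_s3 and f.copied_to_s3:
        return S3_BASE + f.name
    return LOCAL_BASE + f.name
```

A file that has not been copied out yet is always served locally, so uploads are available immediately while the copy to S3 happens in the background.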
If image is not displayed, you may access it here: http://photo.artistserver.com/1/8/D94D29F7-10DC-3A80-B070C1CF742F7D63_O.jpg
BTW - If this kind of thing interests you, I suggest that you pick up Cal Henderson's book "Building Scalable Web Sites" published by O'Reilly.
Amazon
redundancy
S3
scale
webapps
ArtistServer
Sunday, Feb 11, 2007 4:57:30 PM
ArtistServer Adds More Support for Microformats!
This weekend, I rolled out some updates to ArtistServer.com - expanding our support of Microformats, to include xfn and hcard. Below is an outline of all the Microformats ArtistServer currently supports.
Learn more about Microformats here: http://microformats.org
- Microformat: xfn
URL: http://gmpg.org/xfn/
Where: xfn is now used on each artist and member's Friend page on ArtistServer.
About: XFN (XHTML Friends Network) is a simple way to represent human relationships using hyperlinks. In recent years, blogs and blogrolls have become a rapidly growing area of the Web. XFN enables web authors to indicate their relationship(s) to the people in their blogrolls simply by adding a 'rel' attribute to their links.
- Microformat: hcard
URL: http://microformats.org/wiki/hcard
Where: At the bottom of each artist and member's homepage on ArtistServer.
About: hCard is a simple, open, distributed format for representing people, companies, organizations, and places, using a 1:1 representation of the properties and values of the vCard standard (RFC2426 (http://www.ietf.org/rfc/rfc2426.txt)) in semantic XHTML. hCard is one of several open microformat standards suitable for embedding in (X)HTML, Atom, RSS, and arbitrary XML.
- Microformat: rel-license
URL: http://microformats.org/wiki/rel-license
Where: Each song on the site carries a rel-license microformat on ArtistServer.
About: Rel-License is a simple, open format for indicating content licenses which is embeddable in (X)HTML, Atom, RSS, and arbitrary XML.
- Microformat: rel-tag
URL: http://microformats.org/wiki/rel-tag
Where: Everywhere you see tags being used, the rel-tag microformat is assigned to the link.
About: By adding rel="tag" to a hyperlink, a page indicates that the destination of that hyperlink is an author-designated "tag" (or keyword/subject) for the current page. Note that a tag may just refer to a major portion of the current page (i.e. a blog post). e.g. by placing this link on a page.
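Since xfn and rel-tag both come down to a 'rel' attribute on an ordinary link, a small sketch may help. This is Python for illustration only, not ArtistServer's code; the function names and example URLs are made up.

```python
from html import escape
from urllib.parse import quote


def xfn_link(url: str, name: str, relationships: list[str]) -> str:
    """Build a link whose rel attribute lists XFN values such as 'friend' or 'met'."""
    rel = " ".join(relationships)
    return f'<a href="{escape(url)}" rel="{escape(rel)}">{escape(name)}</a>'


def tag_link(tag_base_url: str, tag: str) -> str:
    """Build a rel-tag link; the tag is the last path segment of the href."""
    href = tag_base_url.rstrip("/") + "/" + quote(tag)
    return f'<a href="{escape(href)}" rel="tag">{escape(tag)}</a>'
```

For example, `xfn_link("http://example.com/jane", "Jane", ["friend", "met"])` produces a link with `rel="friend met"`, and `tag_link("http://example.com/tags/", "techno")` produces a link to `.../tags/techno` with `rel="tag"`.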
Draft Formats: The following are draft formats which ArtistServer currently supports.
- Microformat: geo
URL: http://microformats.org/wiki/geo
Where: Inside each hCard, ArtistServer outputs the artist or member's geocoordinates if they are known.
About: geo is a simple format for marking up geographic latitude and longitude information, suitable for embedding in (X)HTML, Atom, RSS, and arbitrary XML. geo is a 1:1 representation of the "geo" property in the vCard standard (RFC2426 (http://www.ietf.org/rfc/rfc2426.txt)) in XHTML, one of several open microformat standards.
- Microformat: enclosure
URL: http://microformats.org/wiki/rel-enclosure
Where: Each mp3 download link on the site carries an enclosure microformat on ArtistServer.
About: RelEnclosure is a simple, open, format for indicating files to cache which is embeddable in (X)HTML, Atom, RSS, and arbitrary XML.
- Microformat: adr
URL: http://microformats.org/wiki/adr
Where: Inside each hCard, ArtistServer outputs the artist or member's City, State and Country.
About: adr is a simple format for marking up address information, suitable for embedding in (X)HTML, Atom, RSS, and arbitrary XML. adr is a 1:1 representation of the adr property in the vCard standard (RFC2426 (http://www.ietf.org/rfc/rfc2426.txt)) in XHTML, one of several open microformat standards. It is also a property of hCard.
ArtistServer
hcard
microformats
xfn
Mobile - Web - Media
Friday, Feb 09, 2007 11:56:24 AM
The Startup Dance, It's Just a Jump to the Left...
The last few days, I've been thinking about the balance I'm always dealing with in terms of taking on contract programming jobs to pay the bills, and working on launching a startup.
It was easier a few years ago, when I had one main client that covered all my bills. The rest of my time was spent learning, researching, and developing the platform ArtistServer.com runs on. Those days are gone, and I'm now back to the 'month-to-month' lifestyle of a contract programmer. Most projects only last a few weeks, and with the huge amount of competition out there, I've had to drop my hourly rate each year for the last 3 years. This translates to having to find 1-2 new development contracts each month just to get by.
Below, you'll find a passage I wrote this morning about the balance between working to survive, and working to get a startup launched.
The startup dance is a series of actions and movements whereby you juggle your current means of income with an attempt to execute a more attractive opportunity.
The startup dance is a relationship, where current work supports your startup.
The startup dance is a war, where current work and your startup do battle for the rights to your future.
The startup dance happens every day, for 12-16 hours each day.
The startup dance reveals your weaknesses and sharpens your skills.
The startup dance is about doing more with less.
The startup dance is a competition with all the other startup dancers.
The startup dance is best done with a small team that dances well together.
Yes, it's not very poetic, but I do feel it shows several of the dynamics and conditions a person must deal with when trying to get a startup going.
Why do it?
- Achieve Dreams - always have a dream, and always work toward it
- Learn and Grow - you'll fail more often than succeed, and each failure is an opportunity to learn
- Spread Ideas - ideas can change the world
- Challenge Yourself - find out who you really are
- Expose Yourself to More Opportunities - life is about opportunities for experiences and the perspectives you gain along the way
Have a great weekend :)
ArtistServer
startup
Mobile - Web - Media
Saturday, Jan 27, 2007 11:10:19 AM
MyBlogLog - Claim ANY Site or Blog as Your Own
It appears that MyBlogLog allows anyone to claim a site as their own. All you have to do is join, fill out a form, type in a URL - and the site is yours on the MyBlogLog system.
The user does not have to prove the site is theirs. They don't have to download and install a small html page with a unique ID that MyBlogLog could then request and thus prove that the person owns that site. Once they claim the site as theirs - it's locked, and the actual owner can't add the site to their own profile.
This isn't bad planning on MyBlogLog's part... it's terrible.
I found this out today because a user on my site has claimed my company site as his own on MyBlogLog. I'd post the link, but I don't want to cause any trouble for the actual artist - it could very well have been a mistake on his part. BUT - this is something MyBlogLog should not allow - just as other sites like Technorati don't provide you a means to claim other people's sites and blogs as your own.
I just sent him an email asking him to remove it or rename it, but as you know, the user may not respond, and he may not remove it. I also sent an email to MyBlogLog asking for help.
I find it odd that they are allowing people to claim websites as their own without providing any means to actually prove that it's their site. Many services will give you a piece of code to save onto your site, which the service then requests from the site to prove that the site is that person's. MyBlogLog isn't asking anyone to prove anything - which sounds to me like they are creating a huge problem down the road as spammers and traffic-hungry bloggers claim sites they neither own nor run. How will they deal with this? Don't they realize it's going to rapidly transform them into a cesspool?
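Here is a rough Python sketch of the kind of proof I mean. It is only an illustration under my own assumptions, not how MyBlogLog or any other service actually does it: the `<token>.html` file naming and the function names are made up, and the page fetch is passed in as a function so the check can be tried without a network.

```python
import secrets
from typing import Callable, Optional


def new_claim_token() -> str:
    """Generate an unguessable token for the claimant to publish on their site."""
    return secrets.token_hex(16)


def verification_url(site_url: str, token: str) -> str:
    """Hypothetical convention: the token is saved as <site>/<token>.html."""
    return site_url.rstrip("/") + "/" + token + ".html"


def verify_claim(site_url: str, token: str,
                 fetch: Callable[[str], Optional[str]]) -> bool:
    """Return True only if the site serves a page containing the token.

    `fetch` takes a URL and returns the page body, or None on failure.
    """
    body = fetch(verification_url(site_url, token))
    return body is not None and token in body
```

Only someone who can actually write files to the site can make `verify_claim` succeed, which is the whole point: the claim stays pending until the token page can be fetched.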
Give it a go - claim every corporation you can think of, load 'em up - they're yours for the taking... actually, no, don't do that. Let's wait and see if they will fix this and add a reliable means for claiming sites - and if they don't - live it up, because as soon as someone starts claiming all the larger corps, Web Startups, etc - larger voices will call out to MyBlogLog demanding a fix. But remember, it's best to try to work any problem out with conversation - so let the conversation begin! I sure hope MyBlogLog improves their system and makes it more secure - claiming a site/url/blog as your own MUST have proof - don't you think? And yes - I'm a member: http://www.mybloglog.com/buzz/members/gideonmarken/
Here is a screenshot from the MyBlogLog site showing their "Add site" form - displaying that someone else "authors" my company site. Nice! http://www.artistserver.com/m1/8/9/media/21486.jpg 
blogs
MyBlogLog
Mobile - Web - Media
Thursday, Jan 25, 2007 2:07:43 PM
Optimizing and Scaling Web Apps
After reading about how MySpace dealt with scaling up to support millions of users, I decided it was a good idea to review my application's architecture, and identify which areas could scale independently and which areas need optimization.
I first outlined the application that generates ArtistServer into seven parts. There's opportunity for optimization everywhere, so by looking at the application in different ways, I feel you can better identify those optimization points.
- Application Zones - the site divided into zones of functionality
- Sub-Applications - areas of the application which can run on separate servers and scale independently from the main site
- Flat File Publishing/Caching - writing of content and charts to files on the server to cut down on database access
- Site Data Methods - listing the interaction with the database by object and method
- Views - listing of all reusable views in the application, like 'song' and 'member'
- Widgets (future) - listing of planned widgets
- API (future) - outline of API plans
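As an illustration of the flat-file caching item above, here is a minimal Python sketch. It is not ArtistServer's code (which runs on ColdFusion); the function name, the `render` callback, and the age-based expiry are assumptions made for the example.

```python
import os
import time
from typing import Callable


def cached_render(path: str, max_age_seconds: float,
                  render: Callable[[], str]) -> str:
    """Return the cached file if it is fresh enough, else rebuild it.

    `render` stands in for the database-backed code that builds the content.
    """
    try:
        age = time.time() - os.path.getmtime(path)
        if age < max_age_seconds:
            with open(path, encoding="utf-8") as f:
                return f.read()
    except OSError:
        pass  # no cache file yet (or unreadable): fall through and rebuild

    content = render()
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        f.write(content)
    os.replace(tmp_path, path)  # atomic swap so readers never see a partial file
    return content
```

A chart that is expensive to query could be rebuilt at most once per hour this way, with every other request served straight from the file.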
I then outlined the application into 25 Application Zones:
- Homepage
- Artist/Member sites
- Music/Ringtone/Area/Genre pages
- Serving Mp3s/Uploading mp3s
- RSS
- My Account Admin
- Artist Member Files
- Photos Area
- Charts
- Serving/Resizing Photos
- Stations/Playlists
- Store
- Site Skinning
- CSS
- About/Info pages
- Blogs
- Tags
- Favorites
- Reviews/Comments
- Widgets
- Error system/pages
- Stats/Tracking
- Site Admin
- Wiki
- Forums
For each zone, I wrote what I felt was wrong, what could be done better, and what could run as a sub-application on its own server. When I use the term sub-application, what I mean is taking a zone or area that has specific functionality and modifying it so that it can run either on the same server as the main site, or on a separate server where it can have its own resources and potentially scale on its own.
For example, the following are already functioning as sub-applications:
- Database Server - database is hosted on its own server
- Forums - hosted on a separate server using forums.artistserver.com
- Photos - photo uploads, resizing, and serving is done on a separate server using photos.artistserver.com
- MP3 Streams/Downloads - all mp3s are served from media.artistserver.com, which runs on the same server as the main site, but is ready to support a move to its own server, or even a streaming provider
After my critical analysis, I found there were three zones I should consider modifying, turning them into sub-applications. They were: stats/tracking, artist/member's files, and RSS. As an example, I'll explain what is being done to the stats and tracking in the application.
Stats and Tracking
The application tracks page data on artist/member sites, but only for those who've upgraded. For all artists, we track every stream and download of their music, filtering out repeat streams from the same IP within a 24hr period to keep people from gaming the charts.
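The repeat-stream filter can be sketched in a few lines of Python. This is an in-memory illustration only, not the site's actual implementation (which lives in the database), and keying on the IP plus the song is my assumption for the example.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=24)


class StreamCounter:
    """Counts a stream only once per (IP, song) within a 24-hour window."""

    def __init__(self) -> None:
        self._last_counted: dict[tuple[str, int], datetime] = {}
        self.counts: dict[int, int] = {}

    def record(self, ip: str, song_id: int, when: datetime) -> bool:
        """Return True if the stream was counted, False if filtered as a repeat."""
        key = (ip, song_id)
        last = self._last_counted.get(key)
        if last is not None and when - last < WINDOW:
            return False
        self._last_counted[key] = when
        self.counts[song_id] = self.counts.get(song_id, 0) + 1
        return True
```

The window is measured from the last counted stream, so reloading a song repeatedly does not keep pushing the window forward.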
All the calls to the stats tables were something that could happen asynchronously. There wasn't a need for the processes being tracked to wait for the tracking process to complete before completing themselves. For example, you have an artist page, and starting at the top, you would have a query to the database to return that person's record, another query that tracks the access to the page, another to return their songs, etc. In this case, each query happens in sequence, which means one has to complete before the next can execute.
Since the captured stats data isn't displayed on the page or used in any other portion of the code, capturing it and writing it to the database is a prime candidate for becoming an asynchronous process.
I wrote a new stats object and set it up on a separate server from the main application server (so my asynchronous stats calls don't compete for resources on the main server). I then replaced my stats script on the artist and member pages with a call to the AsyncHTTP object by Compound Theory (a ColdFusion CFC that uses Java), ran some tests, and took it live. Now when the page executes, and the AsyncHTTP object calls my stats server, the page continues to process without waiting, shaving off several milliseconds from every request to an artist or member page.
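For readers who don't use ColdFusion, the same fire-and-forget idea can be sketched in Python with a queue and a background thread. This is not the AsyncHTTP CFC and not my production code; the class name and the `send` callback (which stands in for the HTTP call to the stats server) are assumptions for the example.

```python
import queue
import threading
from typing import Callable


class AsyncStats:
    """Hands tracking events to a background thread so the page need not wait."""

    def __init__(self, send: Callable[[dict], None]) -> None:
        # `send` stands in for the HTTP call to the separate stats server.
        self._send = send
        self._events: queue.Queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def track(self, event: dict) -> None:
        """Queue the event and return immediately."""
        self._events.put(event)

    def _run(self) -> None:
        while True:
            event = self._events.get()
            try:
                self._send(event)
            except Exception:
                pass  # a failed stats call should never break page rendering
            finally:
                self._events.task_done()

    def flush(self) -> None:
        """Block until queued events are sent (useful in tests or at shutdown)."""
        self._events.join()
```

The page code only ever calls `track`, which returns as soon as the event is queued; the slow part happens on the worker thread.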
The same thing will be done with the mp3 tracking on the site, which as you can imagine, should make the process of requesting an mp3 for streaming and/or download faster for everyone.
After that, I'll break the stats tables out into their own database, so the main database can be free to handle the application, and the stats database can grow independently. Initially, I'll still run the database on the same database server, but the point is that at any time, that's a viable scaling option - the stats could easily be set up on their own database server or server cluster.
Sounds like fun eh? :)
Objectize, Optimize, Modularize, Asynchronize
Create and extend objects. Optimize at all levels. Think modularly. Asynchronize background processes.
After completing changes to the three zones I identified as potential sub-applications, I'll go back over my outline and analysis, and start working on addressing the issues I've listed. While I'm sure we will still have growing pains as we scale up, I feel the time invested now not only delays that experience, but also prepares us for that day. Another plus from this process is that more of the application is getting documented, something that rarely gets done with Web projects.
ArtistServer
asynchronous
optimizing