Questions from Tim Haines, Part II

This is Part II in a two-part series. Part I covers the Trade Me application architecture.

Tim’s second lot of questions are about our dev tools and process:

Q: Any third party tools in the software or the dev/management process?

Q: What source control software do you use, and how do you use it?

Q: How do you manage roll outs? Dev/Staging/Live?

Q: Do you use pair programming, or adopt any other methodologies from the agile world?

The answers to these questions are just a snapshot, capturing how we do things today (early in April, 2007).

I go far enough back to remember when our “development environment” was Jessi’s PC (Jessi at that stage was our entire Customer Service department!). Back then there was no source control as such – we all shared a single set of source files. To deploy a change we would simply copy the relevant ASP files directly onto the live web server and then quickly make the associated database changes.

Somehow it worked!

Ever since then we’ve been constantly tweaking the tools and processes we use to accommodate a growing team and a growing site. As our application and environment have evolved and become more complex, our tools and processes have had to change too.

This change will continue, I’m sure. So, it will be interesting to come back to this post in another 8 years and see if the things I describe below sound as ridiculous then as the things I described above do now.

Also, the standard disclaimer applies to these ideas: what makes sense for us, in our environment and with our site, may not make sense to you in yours. So, please apply your common sense.

Tools

Our developers use Visual Studio as their IDE and Visual SourceSafe for source control.

All of our .NET application code and all of our stored procedures are kept in a SourceSafe project. Developers tend to work in Visual Studio and use the integration with SourceSafe to check files in and out etc.

Thus far we’ve used an exclusive lock approach to source control. So, a developer will check out the file they need to make changes to and hold a lock over that file until the changes are deployed.

However, as the team gets bigger this approach has started to run into problems – for example, where multiple developers are working on changes together, or where larger changes need to be made, causing key files to be locked for longer periods.

To get around these issues, we’re increasingly working on local copies of files and only checking the files out to merge in our changes later. I imagine we will shortly switch to an edit-merge-commit approach, and that will require us to look again at alternative source control tools (e.g. SourceGear’s Vault, Microsoft’s Visual Studio Team System or perhaps Subversion – we’d be interested to hear from anybody who’s had experience with any of these).

Release Manager

At the centre of our dev + test process is a tool we’ve built ourselves called the ‘Release Manager’.

This incorporates a simple task management tool, where tasks can be described and assigned to individual developers and testers. It also hooks into source control, and allows a developer to associate source code changes with the task they are working on.

This group of files, which we call a ‘package’, may include ASPX files, VB class files as well as scripts to create or replace stored procedures in the database.
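To make that a little more concrete, here’s a rough sketch of the kind of record Release Manager keeps for each package. The names and fields here are hypothetical – this isn’t the actual Release Manager schema – but it gives the general idea:

' Hypothetical sketch only - not the actual Release Manager code.
Public Class Package
	Public PackageId As Integer
	Public TaskId As Integer                 ' the task this package belongs to
	Public Status As String                  ' e.g. "ready to test", "in test", "live"
	Public Files As New List(Of String)      ' ASPX pages, VB class files, stored procedure scripts
	Public DependsOn As New List(Of Integer) ' ids of packages that need to be deployed first
End Class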

The tool also incorporates reports which help us track tasks as they progress through the dev + test process. These are built using SQL Reporting Services.

Environments

We have four environments:

  1. Dev: this includes a shared database instance and local web servers for each developer.
  2. Test: this includes a production-like database (actually databases, as we now have multiple instances in production) and a separate web server.
  3. Stage: our pre-production environment, again with its own web server.
  4. Production: our live site, which actually incorporates two environments currently, one in Wellington and one in Auckland.

Developers typically work on changes individually. We have a code-review process, so any code changes have two sets of eyes over them before they hit test.

Once a code change is completed, the developer will create the package in Release Manager and set the task to be ‘ready to test’ so it appears on the radar of the test team.

We have a web-based deployment tool which testers can use to deploy one or more packages into the test environment. This involves some Nant build scripts which get the source files for the selected packages, copy these into the test environment and then build the .NET assemblies on the test server. The build script also executes any associated database changes that are included, and then updates the status of the package/s to ‘in test’.

The deploy tool is able to use the data from Release Manager to identify any dependencies between packages. Where dependencies exist we’re forced to deploy packages in a certain order, but in the general case we’re able to deploy packages independently of each other, which provides a great deal of flexibility and allows us to respond quickly where required (e.g. when an urgent bug fix is needed).
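For the curious, the ordering itself needn’t be anything clever. Here’s a minimal sketch of the kind of dependency ordering involved, assuming dependencies are known as simple lists of package ids (this is illustrative only, not our actual deploy tool code, and it ignores circular dependencies):

' Hypothetical sketch: given each package's dependencies (by id), produce a
' deploy order in which a package always follows the packages it depends on.
Public Function GetDeployOrder(ByVal dependencies As Dictionary(Of Integer, List(Of Integer))) As List(Of Integer)
	Dim ordered As New List(Of Integer)
	Dim visited As New List(Of Integer)
	For Each packageId As Integer In dependencies.Keys
		Visit(packageId, dependencies, visited, ordered)
	Next
	Return ordered
End Function

Private Sub Visit(ByVal packageId As Integer, _
		ByVal dependencies As Dictionary(Of Integer, List(Of Integer)), _
		ByVal visited As List(Of Integer), ByVal ordered As List(Of Integer))
	If visited.Contains(packageId) Then Return
	visited.Add(packageId)
	If dependencies.ContainsKey(packageId) Then
		For Each depId As Integer In dependencies(packageId)
			Visit(depId, dependencies, visited, ordered)
		Next
	End If
	ordered.Add(packageId)   ' added only after everything it depends on
End Sub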

Production

Once a package has been tested the test team use the same deploy tool to move the package into the stage environment ready for go-live.

From there the responsibility switches to the platform team, who manage our production web servers. They have automated scripts, again built using Nant, which deploy from stage to our production environment/s. These scripts update configuration files, then copy the required production files to the various web server locations. They also manage the execution of database scripts. The idea is to get everything as close to the brink as possible (which is the time-consuming part of the deploy process) and then tip everything over the edge as quickly as possible, so as to minimise disruption to the site.

Typically we do two production releases each day, although this number varies (up and down) depending on the specific packages. In most cases these releases are done without taking the site offline.

The bigger picture

Our dev + test process is just one part of a much bigger product management process, which is roughly represented by the diagram below (click for a larger view):

Product Management Process

The other parts of this process are probably fodder for a separate post, but it’s important to note that there is a loop involved here.

Most of the changes we make to the site are influenced heavily by previous changes – in many cases, by very recent changes. This only works like it does because our process allows us to iterate around this loop quickly and often.

While we don’t follow any formal agile methodology, our process is definitely lightweight. We don’t produce lots of documentation, which is not to say that we don’t design changes up-front, just that we don’t spend too much time translating that thinking into large documents (it’s not uncommon for screen designs to be whiteboard printouts for example).

While we do make larger changes from time to time (for example, the DVD release which went out last week), the vast majority of changes we make are small and seemingly insignificant. Again, this only works because each of these small changes is able to flow through with minimal friction added by the tools and processes.

I’d also hate to give you the impression that this process is perfect. There is massive room for improvement. The challenge for us is to continue to look for these opportunities.

More?

That’s it for Tim’s questions. I hope some of that was useful?

If you have any other questions, ask away. You can either place a comment below or contact me directly. My email address is in the sidebar to the right.

Questions from Tim Haines, Part I

This is Part I in a two-part series. Part II covers the Trade Me development process and tools.

It’s been a while since we got geeky here, so …

After my recent post about our migration to ASP.NET I got sent a bunch of questions from Tim Haines. I thought I’d try and pick these off over a couple of posts.

To start with, a few questions about our application architecture:

Q: What’s the underlying architecture of Trade Me – presentation layer / business logic / data layer / stored procedures? All isolated on their own servers?

Q: Are there any patterns you find incredibly useful?

Q: Do you use an O/R mapper or code generator, or is all DB interaction highly tuned?

Q: What third party libraries do you use for the GUI? I see you have Dustin’s addEvent. Follow any particular philosophy or library for AJAX?

Here is a basic diagram I use to represent the application architecture we use on all of our sites at Trade Me (click for a larger view):

Application Architecture Diagram

We’ve worked hard to keep this application architecture simple.

There are two layers within the ASP.NET code + one in the database (the stored procedures). I’ll start at the bottom of the diagram above and work up.

Data Layer

All database interaction is via stored procedures. This makes it easier to secure the database against threats like SQL injection. It also makes it easier to monitor and trace the performance of the SQL code and identify where tuning is required.

Within the application we manage all access to the database via the Data Access Layer (DAL).

All of the classes within the DAL inherit from a common base class, which is a custom class library we’ve created (based loosely on the Microsoft Data Access Application Block). This base class provides all of the standard plumbing required to interact with the database – managing connections, executing stored procedures and processing the results.
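To give a feel for what that means in practice, here’s a minimal sketch of the kind of thing such a base class might provide. This isn’t our actual class – the method names are simply inferred from the examples further down, and the connection string name is made up – but it shows the general shape:

Imports System.Configuration
Imports System.Data
Imports System.Data.SqlClient

Public MustInherit Class DalBase

	' Hypothetical sketch only - the connection string name is invented.
	Private ReadOnly _connectionString As String = _
		ConfigurationManager.ConnectionStrings("Main").ConnectionString

	' Run a stored procedure that returns no result set.
	Protected Sub ExecuteNonQuery(ByVal procName As String, ByVal ParamArray parameters() As SqlParameter)
		Using cn As New SqlConnection(_connectionString)
			Using cmd As New SqlCommand(procName, cn)
				cmd.CommandType = CommandType.StoredProcedure
				cmd.Parameters.AddRange(parameters)
				cn.Open()
				cmd.ExecuteNonQuery()
			End Using
		End Using
	End Sub

	' Run a stored procedure and hand back a reader. The connection is closed
	' when the caller closes the reader (CommandBehavior.CloseConnection).
	Protected Function ExecuteDataReader(ByVal procName As String, ByVal ParamArray parameters() As SqlParameter) As SqlDataReader
		Dim cn As New SqlConnection(_connectionString)
		Dim cmd As New SqlCommand(procName, cn)
		cmd.CommandType = CommandType.StoredProcedure
		cmd.Parameters.AddRange(parameters)
		cn.Open()
		Return cmd.ExecuteReader(CommandBehavior.CloseConnection)
	End Function

	' Typed helpers for reading columns by name, with simple DBNull handling.
	Protected Function GetInteger(ByVal dr As SqlDataReader, ByVal name As String) As Integer
		Dim i As Integer = dr.GetOrdinal(name)
		If dr.IsDBNull(i) Then Return 0
		Return Convert.ToInt32(dr(i))
	End Function

	Protected Function GetString(ByVal dr As SqlDataReader, ByVal name As String) As String
		Dim i As Integer = dr.GetOrdinal(name)
		If dr.IsDBNull(i) Then Return String.Empty
		Return Convert.ToString(dr(i))
	End Function

End Class

The real thing has more to it (handling the different connection strings mentioned below, for a start), but the examples that follow should make sense against this shape.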

The methods within the DAL classes themselves specify the details of the database logic – specifying which stored procedure to call, managing parameters, validating inputs and processing outputs.

So, for example, to process a new bid we might implement this DAL method:

Public Sub ProcessBid(ByVal auctionId As Integer, ByVal bidAmount As Decimal)
	ExecuteNonQuery("spb_process_bid", _
		New SqlParameter("@auction_id", auctionId), _
		New SqlParameter("@bid_amount", bidAmount))
End Sub

A couple of things to note here:

  • All of our code is VB.NET, so that’s what I’ll use in these examples. Apologies to those of you who prefer curly brackets. Perhaps try this VB.NET to C# converter ;-)
  • Obviously (hopefully!) this is not actual Trade Me code – just an example to demonstrate the ideas.

When we need to return data from the DAL we use Model classes. These are thin container classes which provide an abstraction from the data model used within the database and mean we don’t need to hold a database connection open while we process the data.

A simplistic Model class might look like this:

Public Class MemberSummary
	Public MemberId As Integer
	Public Name As String
End Class

Some Model classes use properties rather than exposing public member variables directly, and a few include functions and behaviours, but most are just a simple collection of public member variables.

Model classes are always instantiated within the DAL, never within the Web layer. We don’t pass Model objects as parameters (if you look closely at the diagram above you’ll notice the lines through the Model layer only go upwards). This gives us an explicit interface into our DAL methods.

So, to get a list of members from the database we might implement this DAL method:

Public Function GetMemberSummaries() As IList(Of Model.MemberSummary)

	Dim list As New Generic.List(Of Model.MemberSummary)
	Dim dr As SqlDataReader = Nothing
	Try
		dr = ExecuteDataReader("spt_get_member_summary")
		While dr.Read()
			Dim item As New Model.MemberSummary
			item.MemberId = GetInteger(dr, "member_id")
			item.Name = GetString(dr, "name")
			list.Add(item)
		End While
	Finally
		If Not dr Is Nothing AndAlso Not dr.IsClosed Then
			dr.Close()
		End If
	End Try
	Return list
End Function

DAL methods are grouped into classes based on common functionality. This is an arbitrary split – in theory we only need 6 DAL classes (one class per connection string variation), but in practice we currently have 47.

The two examples above show the patterns that make up the vast majority of DAL methods.

While we don’t use an O/R mapper, we have created a simple tool, which we call DALCodeGen. Using it we specify which proc to call and the tool generates the DAL method and, if appropriate, the Model class. This code can then be pasted into the project and tweaked/tuned as required.

Web Layer

All the remaining application code sits in the Web layer. This is a mixture of business and presentation logic, which in part is a reflection of our ASP heritage.

During the migration we created controls to implement our standard page layout, such as the tabs and sidebar which appear on every Trade Me page. These were previously ASP #include files. We’ve also implemented controls for common display elements such as the list and gallery view used when displaying a list of items on the site.

We have a base page class structure. These classes implement a bunch of common methods – for example, security and session management (login, etc.), URL re-writing and common display methods.
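As a rough illustration of the shape (again, hypothetical names rather than our actual code):

' Hypothetical sketch of a base page class.
Public MustInherit Class BasePage
	Inherits System.Web.UI.Page

	' Pages that require a logged-in member override this.
	Protected Overridable ReadOnly Property RequiresLogin() As Boolean
		Get
			Return False
		End Get
	End Property

	Protected Overrides Sub OnInit(ByVal e As System.EventArgs)
		MyBase.OnInit(e)
		If RequiresLogin AndAlso Not IsLoggedIn() Then
			Response.Redirect("/Login.aspx?url=" & Server.UrlEncode(Request.RawUrl))
		End If
	End Sub

	Protected Function IsLoggedIn() As Boolean
		' The real session/cookie checks would live here.
		Return Session("MemberId") IsNot Nothing
	End Function

	' An example of a common display helper shared by every page.
	Protected Function FormatPrice(ByVal amount As Decimal) As String
		Return amount.ToString("C", New System.Globalization.CultureInfo("en-NZ"))
	End Function

End Class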

Most of the page-specific display code is currently located in methods which sit in the code-behind rather than in controls.

We don’t use the built-in post-back model – in fact ViewState is disabled in our web.config file and only enabled on a case-by-case basis as required (typically only on internal admin pages).

With the exception of addEvent we also don’t currently use any third-party AJAX or JavaScript libraries. To date none of the AJAX functionality we’ve implemented has required the complex behaviours included in these libraries, so we’ve been able to get away with rolling our own simple asynchronous post/call-back logic.

Layers vs. Tiers

Each of the yellow boxes in the diagram above is a project within the .NET solution, so is compiled into a separate .NET assembly. All three assemblies are deployed to the web servers and the stored procedures, obviously, live on the database servers. So, there are only two physical “tiers” within this architecture.

Inspirations

There is no such thing as an original idea.

Most of this design was inspired by the PetShop examples created by Microsoft as a means of comparing .NET to J2EE. These were pretty controversial – there was a lot of debate at the time about the fairness of the comparison. Putting the religious debate to one side, I thought the .NET implementation was a good example of an application designed with performance in mind, which was obviously important to us.

Another reference I found really useful when I first started thinking about this was ‘Application Architecture for .NET: Designing Applications & Services’, which was published by the Patterns & Practices Group at Microsoft. This is still available, although likely now out of date with the release of ASP.NET 2.0. It’s also important to realise that this book is intended to describe all of the various aspects that you might include in your architecture. Don’t treat it as a shopping list – just pick out the bits that apply to your situation.

Disclaimer

I’m a little reluctant to write in detail about how we do things. I’d hate to end up in the middle of a debate about the “right way” to design or architect an application.

Should you follow our lead? Possibly. Possibly not.

I can say this: if somebody has sent you a link to this saying “look, this is how Trade Me does it … it must be right” they are most likely wrong. You should at least make sure they have other supporting reasons for the approach they are proposing.

A lot of our design decisions are driven by performance considerations, given our size and traffic levels. These constraints probably won’t apply to you.

In other cases we choose our approach based on the needs of the dev team. We currently have 8 developers, and ensuring that they can work quickly without getting in each other’s way too much is important. Smaller or larger teams may choose a different approach.

Also, a lot of our code still reflects the fact that it was recently migrated from an ASP code base. If you’re creating an application from scratch you might choose to take advantage of some of the newer language features which we don’t use.

More?

I hope some of that is useful? If you have any other questions send them through – my email address is in the sidebar to the right.

.NET usage on the client

Nic commented on my recent server stats post asking if we have any stats on the % usage of .NET on the client. As he pointed out, the CLR version number is included in the IE user agent string.
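For anyone wanting to repeat this against their own logs, here’s a rough sketch of how the CLR tokens can be pulled out of a user agent string (not the script I actually used, just an illustration):

Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module UserAgentSample

	' Pull the ".NET CLR x.y.zzzz" tokens out of an IE user agent string.
	Function GetClrVersions(ByVal userAgent As String) As List(Of String)
		Dim versions As New List(Of String)
		For Each m As Match In Regex.Matches(userAgent, "\.NET CLR (\d+(\.\d+)+)")
			versions.Add(m.Groups(1).Value)
		Next
		Return versions
	End Function

	Sub Main()
		Dim ua As String = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; " & _
			".NET CLR 1.1.4322; .NET CLR 2.0.50727)"
		For Each v As String In GetClrVersions(ua)
			Console.WriteLine(v)   ' 1.1.4322 and 2.0.50727
		Next
	End Sub

End Module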

I took a sample of 70,000 IE users from recent server logs and these are the results:

.NET CLR version    Usage
None                43.9%
1.0.3705             6.7%
1.1.4322            50.8%
2.0.50727           12.5%
3.0.*                1.7%

In case you’re wondering why these percentages add up to more than 100%, it is possible to install multiple versions of the runtime side-by-side. In total 56.1% of people have one or more version installed.

This would give me significant pause if I were developing a client-side application which depends on the runtime being installed.

That’s the beauty of a web application I suppose.

Source data: http://spreadsheets.google.com/pub?key=p03Pw5UOTJJ425das60qoLA

UPDATE (30-Jan): Picking up on Nigel’s comment, I’ve updated the table above to include version 3.0. This number includes both 3.0.04320 (beta) and 3.0.04506.

ASP.NET 2.0

Last week we deployed Trade Me as an ASP.NET 2.0 application. We switched over early on Tuesday morning without even taking the site offline. With luck, nobody noticed. Nonetheless, this is an exciting milestone.

Eighteen months ago all four sites (Trade Me, FindSomeone, Old Friends & SafeTrader) were built using classic ASP, which was starting to show its age. We’ve been working off-and-on since then to bring this code base up-to-date. Most of the heavy lifting was actually done this time last year, when we took the opportunity over the quiet Christmas/New Year period to make a start on the main Trade Me site – taking it from ASP to ASP.NET 1.1.

The opportunity to work on this project was a big part of my motivation for heading home from the UK in 2004. It’s great to reach a point where we can reflect on the huge benefits it has realised, not the least being that we’ve been able to complete this work on our own terms. It’s an awesome credit to the team of people who have made it happen.

Our motivation

I’m pretty proud of the approach we’ve taken. To understand this you really need to understand the motivation for the change in the first place.

In 2004 there were a number of unanswered questions:

How much further could we push ASP before performance was impacted?

Back then, we were actually pretty happy with the performance of our ASP code. It had been tweaked and tuned a lot over the years. We’d ended up building our own ASP versions of a number of the technologies included in ASP.NET, such as caching.

The interaction between ASP and the database, which was (and is!) at the heart of the performance of the site, was pretty carefully managed. For example, we were careful not to keep connections open any longer than absolutely required, etc, etc.

At the application layer we had managed growth by adding more web servers. But, this was something we could only push so far before it would start to create problems for us in other places, most importantly in the database.

While we had confidence that we could continue to work with ASP, that wasn’t necessarily shared by everybody else.

Which led us to the next problem …

How could we continue to attract the best developers to work with us?

It’s hard to believe now that we managed as well as we did without many of the tools and language features that we now take for granted: compiled code, a debugger, a solution which groups together all of the various bits of code, source control to hold this all together, an automated build/deploy process, … the list goes on.

For obvious reasons, we were finding it increasingly difficult to get top developers excited about working with us on an old ASP application.

And there was lots of work to do. As always seems to be the case, there was a seemingly infinite list of functional changes we wanted to make to the site.

So, that left us with the question that had been the major stumbling block to addressing these problems earlier …

How could we make this change without disrupting the vital on-going development of the site?

Looking at the code we had, it was hard to get excited about improving it, and hard to even know where to start. There was a massive temptation to throw it all out and start again.

But, inspired by Joel Spolsky and the ideas he wrote about in Things you should never do, Part I we decided to take the exact opposite approach.

Rebuild the ship at sea

Rather than re-write code we chose to migrate it, one page at a time, one line at a time.

This meant that all of the special cases which had been hacked and patched into the code over the years (which Joel calls “hairy” code) were also migrated, saving us a lot of hassle in re-learning those lessons.

The downside was that we weren’t able to fix all of the places where the design of the existing code was a bit “clunky” (to use a well-understood technical term!). We had to satisfy ourselves in those cases with “better rather than perfect”. As it turned out, none of these really hurt us, and in fact we’ve been able to address many of them already. Once the code was migrated we found ourselves in a much stronger position to fix them with confidence.

Because we had so much existing VBScript code we migrated to VB.NET rather than C# or Java or Ruby. This minimised the amount of code change required (we enforce explicit and strict typing in the compiler, so there was a fair amount of work to do to get some of the code up to those standards, but that would have been required in any case).
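For those not familiar with VB.NET, “explicit and strict typing” just means everything is compiled with the equivalent of these two options switched on (they can be set per file or project-wide):

Option Explicit On   ' every variable must be declared before use
Option Strict On     ' no implicit narrowing conversions or late binding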

We kept the migration work separate from the on-going site work. When migrating we didn’t add new features and we didn’t make database changes. When we were working on site changes we made them to the existing code, leaving it as ASP if necessary, rather than trying to migrate the code at the same time.

We focussed on specific things that we could clean-up in the code as part of the migration process. For example, we added an XHTML DOCTYPE to all pages and fixed any validation errors this highlighted. We moved all database code into separate classes. And, we created controls for common UI elements (in most cases replacing existing ASP include files). We also removed any code which was no longer being used, including entire “deadwood” pages which were no longer referenced.

To build confidence in this approach we started with our smaller sites: first SafeTrader and Old Friends, followed by FindSomeone, and finally Trade Me.

After each site was migrated we updated our plans based on what we’d learnt. The idea was to try and “learn early” where possible. For example, after the Old Friends migration we realised we would need a better solution for managing session data between ASP and ASP.NET, so we used the FindSomeone migration as a test of the solution we eventually used with Trade Me. The performance testing we did as part of the early migrations gave us confidence when it came time to migrate the larger sites.

We re-estimated as we went. By keeping track of how long it was taking to migrate each page we got accurate metrics which we fed into future estimates.

Finally, we created a bunch of new tools to support our changing dev process. For example, we created a simple tool we call “Release Manager” which hooks into our source control and is used to combine various code changes into packages which can then be deployed independently to our test and production environments. We created an automated process, using Nant, which manages our build and deploy, including associated database changes. More recently we implemented automated code quality tests and reports using FxCop, Nunit and Ncover. All of these mean that, for the first time, we can work on keeping the application itself in good shape as we implement new features.

The results

This has been an exciting transformation. The migration was completed in good time, without major impact on the on-going development of the site – we made it look easy! We added four new developers to the team, all with prior .NET experience, and we got all of our existing dev team members involved in the project, giving them an opportunity to learn in the process. Having more people, along with process improvements and better tools, has enabled us to complete a lot more site improvements. We’re in good shape to tackle even more in the year ahead. We’ve even been pleasantly surprised by the positive impact on our platform, which has allowed us to reduce the number of web servers we use (there are more details in the Microsoft Case Study from mid last year, if you’re interested in this stuff).

As is the nature of this sort of change, we’ll never really finish. With the migration completed we’ve started to think about the next logical set of improvements. It will be exciting to see how much better we can make it.

If you’re interested in being part of this story, we’re always keen to hear from enthusiastic .NET developers. Send your CV to careers@trademe.co.nz.