
Monday, June 11, 2007

Litmus - An Analytical Approach

As promised, I am writing about my experience with Litmus this week and my findings. I have heard mention that there is some worry about the turn-out for this week's test day, so I decided to go and look at the numbers.

At first glance, these worries seemed completely unfounded, especially since I had completed 197 tests on my own. Still, I decided to conduct a little research of my own.

I took my time and queried every test day completed since August 2005, looking for the number of testers and the number of results in each one. I plugged those numbers into a spreadsheet and analyzed them. To make them easier to see, I also created three charts: one for the number of results per test day, another for the number of testers per test day, and a final chart showing the ratio of tests completed to testers. This last chart is used to "score" how well a given test day went.
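In case it helps, the "score" is nothing fancier than results divided by testers. Here is a minimal sketch of the calculation, using made-up per-test-day numbers rather than the real Litmus data:

# Hypothetical per-test-day tallies: (date, results submitted, testers).
# The figures are illustrative placeholders, not the actual Litmus numbers.
testdays = [
    ("2007-05-25", 420, 18),
    ("2007-06-01", 260, 9),
    ("2007-06-08", 510, 14),
]

for date, results, testers in testdays:
    score = results / testers  # the "score": results per tester
    print(f"{date}: {results} results / {testers} testers = score {score:.1f}")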

Before I go into my findings I would like to show you copies of the charts for your reference:

Number of Results Per Testday


Number of Testers Per Testday


Testday "Score" - Results per Tester


After going over these numbers, I can theorize the following:
  • Thunderbird tests have always had a low turnout compared to Firefox tests. Since 2007-06-08 was the first time Firefox and Thunderbird were separated into their own test days and held at the same time, we are seeing the results more accurately than before.
  • Our highest tester counts always fall during school semesters, so the fact that school is out for the summer could be contributing to the low numbers as well.
  • It was mentioned that IRC traffic was low during the test day. In my opinion, IRC traffic is not a good indicator of a successful test day; low traffic tells me that there were few issues, which is a good thing.
  • Looking at the charts as they stand, it is tough to extract any particular pattern.
  • It should be noted that all this shows is that the current situation isn't in a big decline. We should still try to come up with ideas on how to promote the test days better and get even more exposure.
  • The days where we had 30+ participants were Seneca test days.


It is my hope that this will become a good initial dataset to help with deciding how to draw more people to the test days.

I think the best approach is to get out to the schools more, and not just the colleges and universities either. Perhaps we could also try to reach graduating high school students who are entering this field.

As a sidebar, I have a couple of ideas for Litmus and Bugzilla that I would like to air.

First off, the test cases in Litmus need to be cleaned up. Often when I run Litmus tests on a specific platform, there will be tests that say "Mac only" or "Linux only" even though I am running, for example, the Windows tests. This means I either have to go run that test on the other platform so I can mark it as "pass" just to reach 100% coverage, or I have to leave it as "not run" and never reach 100% coverage.
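Just to sketch what I mean, here is how coverage could be computed if each test carried a list of platforms it applies to, so a "Mac only" test simply drops out of the Windows run. This is my own toy illustration, not how Litmus actually stores its test cases:

# Hypothetical test records: (name, platforms the test applies to, status).
tests = [
    ("Print preview renders",      {"windows", "mac", "linux"}, "pass"),
    ("Dock icon bounces on alert", {"mac"},                     "not run"),
    ("Tab drag and drop",          {"windows", "mac", "linux"}, "pass"),
]

platform = "windows"
applicable = [t for t in tests if platform in t[1]]
passed = [t for t in applicable if t[2] == "pass"]

# The Mac-only test no longer drags Windows coverage below 100%.
print(f"Coverage on {platform}: {len(passed)}/{len(applicable)}")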

Secondly, Litmus should be able to recognize multiple build ids for one program. For example, a Minefield nightly goes out every night with a different build id, so if I want to continue my tests from the previous day, I have to fool Litmus into thinking I am still running the old build id. Perhaps we could program a window of build ids into Litmus?
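To make the window idea a bit more concrete: nightly build ids are date stamps (something like YYYYMMDDHH at the moment), so Litmus could in principle accept any id within, say, a week of the one a test run was started against. A rough sketch, where the function and the cutoff are my own guesses rather than anything in Litmus:

from datetime import datetime, timedelta

def within_build_window(original_id, current_id, days=7):
    # Assumes the current nightly build id format, a YYYYMMDDHH timestamp
    # such as "2007061104". Purely illustrative; not actual Litmus code.
    fmt = "%Y%m%d%H"
    original = datetime.strptime(original_id, fmt)
    current = datetime.strptime(current_id, fmt)
    return abs(current - original) <= timedelta(days=days)

# A run started against the June 11 nightly still counts for June 13,
# but not for a build a month later.
print(within_build_window("2007061104", "2007061304"))  # True
print(within_build_window("2007061104", "2007071504"))  # False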

My final idea is related to Bugzilla. Is there a way we could implement some kind of duplicate bug detection, like Digg's duplicate story submission detection? For those who do not know, when a user submits a story to Digg, it checks whether that story might already have been submitted. I believe this is based on the URL and the title and description the user gives for the story. It then returns a list of possible duplicates that the user can read to verify they aren't dupes, or the user can submit their story anyway. I know the system isn't perfect, but I believe we could learn something from it and that it would make Bugzilla better, both from the end-user perspective and from Bugzilla's maintainers' perspective. I realize this will not help with our current duped bugs, but it would help as a preventative measure against future duplicates.
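To give a flavour of the kind of check I mean, here is a toy sketch that flags possible duplicates by simple word overlap between bug summaries (Jaccard similarity). It is my own illustration of the general idea, not how Digg or Bugzilla actually does it:

def similarity(a, b):
    # Jaccard similarity between the word sets of two summaries.
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not (words_a or words_b):
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)

def possible_dupes(new_summary, existing_bugs, threshold=0.4):
    # Return existing bugs whose summaries look similar to the new one.
    return [(bug_id, summary)
            for bug_id, summary in existing_bugs
            if similarity(new_summary, summary) >= threshold]

# Made-up bug list for illustration.
bugs = [
    (1001, "Crash when opening a PDF in a new tab"),
    (1002, "Bookmarks toolbar icons misaligned on Linux"),
]
print(possible_dupes("Browser crashes opening PDF in new tab", bugs))
# -> [(1001, 'Crash when opening a PDF in a new tab')]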

Before I leave you for today, I would like to mention an idea that tchung brought up today: what if testers on Litmus could vote on whether they found a certain test useful, like Digg, Reddit, del.icio.us, or any of those sites? Sounds kind of interesting to me.

Anyway, that is all for today.

Stay tuned for another post later this week.

1 comment:

Gerv said...

You know about https://bugzilla.mozilla.org/duplicates.cgi , right? :-)