Friday, October 06, 2006

Measurements and Statistics for Stovers

Our Berkeley Darfur stove development experience, comparing the efficiency of different designs, indicates that noise in the fuel-use measurements will make it difficult to demonstrate efficiency improvements clearly – even with a very good testing method (designed just to reduce measurement uncertainties), a difference of 30% in performance between two tests (with different designs and/or cooks) may not be enough to show improvement. The statistical aspects of implementation are non-trivial and deserve plenty of attention, as do good data retention practices – save me the data, Brian, and I’ll try to help crunch it.
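To make the 30% point concrete, here is a minimal sketch of one way to compare two stoves from a handful of tests. It uses an exact permutation test, which is my choice for illustration (not necessarily what we used in Berkeley), and the fuel-use numbers are made up. With only three tests per stove, even a roughly 30% difference in mean fuel use cannot reach the usual 5% significance level.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(a, b):
    """Exact two-sided permutation test on the difference of means.

    Every way of relabelling the pooled results into groups of the
    original sizes is tried; the p-value is the fraction of relabellings
    whose mean difference is at least as large as the observed one.
    """
    pooled = list(a) + list(b)
    indices = range(len(pooled))
    observed = abs(mean(a) - mean(b))
    extreme = 0
    total = 0
    for chosen in combinations(indices, len(a)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        # Small tolerance so that ties with the observed value are counted.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


# Hypothetical grams of wood used per test (illustrative numbers only).
old_stove = [500, 620, 560]
new_stove = [390, 450, 340]

if __name__ == "__main__":
    saving = 1 - mean(new_stove) / mean(old_stove)
    print(f"mean saving: {saving:.0%}")
    print(f"p-value: {permutation_p_value(old_stove, new_stove):.2f}")
```

With three tests each there are only 20 possible relabellings, so the smallest two-sided p-value this test can ever give is 2/20 = 0.10. More tests, or a quieter test, are the only ways out.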

In manufacturing we often have the need to compare two “populations” to see if there is a difference – say because the products are made on two different injection molding machines, are made during different shifts or at different factories, or because we are bringing out a new design. Here is how we go through the process, with just enough statistics to do the job:

  1. Develop a procedure – meaning both a carefully written procedure and a testing station – that seeks to minimize test-to-test variation, measuring the right things and using what we know about the product’s end application to make the test robust and reliable. For a stove we know that the start-up phase is awkward, so the test must be long enough to reduce its impact, and we know that losses from the pot due to evaporation are a problem because they change from test to test. So we might decide that boiling more liters of water (a typical cooking surrogate – at this phase there may be no need to duplicate actual cooking, since that test comes later) for a longer time will reduce test-to-test variation. The test should be designed to most easily highlight the differences between a good product and a worse one, so extreme conditions may be considered – in the case of a stove perhaps a consistent wind is used (if this is a realistic condition), since this has been shown to differentiate similar stoves.
  2. Analyze the test itself – the basic tool used is the “gauge R and R analysis” – examining how repeatability (the same person, testing over and over) and reproducibility (different people testing) affect the outcome of controlled testing. For this test with a particular stove, several cooks perform the same task several times each, then these results (say time to boil or amount of wood used) are put into a special spreadsheet to mathematically determine how much of the observed variation (ideally there should be none – it is the same stove after all) is due to the testing method and stove, and how much is due to the differences between the techniques of each cook. It’s a challenge, but this kind of approach – technical without being obsessive – may be the way that engineers can help most; in this manner new stove designs and implementation approaches can be clearly shown to be effective, and funding agencies can invest with confidence that they will get results and people will be helped. Until then, we will only be able to make crude wishful projections about what the potential impact of a project will be – and this is not good enough. More detail on R&R measurements is available from NIST, but remember that you just have to plug your measurement results into a spreadsheet and the conclusions fall out.
  3. Demonstrate that the testing process is “well behaved” – a test is not good enough if there are too many variables impacting the results. We must try hard to make sure that the differences between tests are due only to random variations, such as weather conditions, slight operator differences, random construction variations between stoves, etc. Systematic variations might include gross differences between manufacturing shops, poorly trained operators, or changes in construction materials – these are things that we are testing for (lapses in quality), so the test itself should eliminate their impacts as much as possible. In statistics we say that the test results should follow a “normal distribution” (also called Gaussian), so that the variable that is being tested for – say the fuel consumption for a specific task, such as boiling 5 liters of water for 45 minutes – has a central value that defines an average, and a bell shape due to the random fluctuations about this mean. There are many other types of population distributions (bimodal, etc.), but the normal distribution is the one that allows us to use the mean and standard deviation (and other traditional statistical tools) to describe its mathematical properties. In Excel you use the "frequency distribution" function to separate all your data points (each one is a value for fuel consumption from an individual test) into small ranges (or "bins") and then you plot the range values against the number of test results in each bin (just like in the figure). If it doesn't look like this figure then your testing method needs work for it to be reliable.
  4. Improve the testing if necessary – if the distribution is not normal, or the gauge R&R results are too large (i.e. we can’t reasonably tell if one stove is better than another, even though one should be) then the test must be modified. Examples of changes might be a clearer written procedure, better operator training, or actual test changes (such as narrowing the allowable temperature range in a simmer test or using a longer boil time, to reduce startup/transient effects or brief operator errors). Remember, a poor test wastes time – it may make more testing necessary (so that results can be averaged) or require superfluous testing (like continual testing of the control stove). The goal is the minimum testing that demonstrates the smallest acceptable difference between dissimilar stoves. The results of each individual test must be trustworthy, so that the results are as clear as they can be. The R&R method is well worth the effort – in the case of a 3x3 evaluation (3 operators test the same stove 3 times each – done just one time, just to evaluate the test procedure) it may take 1 day for the R&R but it cuts the new stove testing time in half (because the control stove no longer needs to be tested every time) for the rest of the program life! And regular R&R testing at different locations (such as at Khartoum and Nyala and the IDP camps) eliminates uncertainty about place-to-place testing variations – a real worry when testing is geographically distributed and can only be lightly supervised.
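For those who would rather not wait for the spreadsheet, steps 2 and 3 above can be sketched in a few lines of Python. This is a simplified one-stove version of a gauge R&R (a one-way variance split, not the full multi-part study described by NIST), and the 3x3 data, the function names, and the 20 g bin width are all assumptions made for illustration.

```python
import math
from collections import Counter
from statistics import mean, variance


def gauge_rr_single_stove(results_by_cook):
    """Split the variation seen on ONE stove into two parts.

    repeatability:   pooled variance of each cook's own repeats
    reproducibility: variance between cooks, after removing the part of
                     the cook-to-cook spread that repeatability explains
    Assumes every cook ran the same number of repeats (at least two).
    """
    groups = list(results_by_cook.values())
    n_repeats = len(groups[0])
    if n_repeats < 2 or any(len(g) != n_repeats for g in groups):
        raise ValueError("each cook needs the same number (>= 2) of repeats")
    repeatability_var = mean([variance(g) for g in groups])
    cook_means = [mean(g) for g in groups]
    ms_between = n_repeats * variance(cook_means)
    reproducibility_var = max(0.0, (ms_between - repeatability_var) / n_repeats)
    return {
        "repeatability_var": repeatability_var,
        "reproducibility_var": reproducibility_var,
        "total_std": math.sqrt(repeatability_var + reproducibility_var),
    }


def frequency_bins(values, bin_width):
    """Count results per bin; a bin labelled k covers [k, k + bin_width)."""
    counts = Counter(math.floor(v / bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


# Hypothetical 3x3 study: grams of wood used, same stove, three cooks.
study = {
    "cook_A": [510, 530, 520],
    "cook_B": [560, 540, 550],
    "cook_C": [500, 520, 510],
}

if __name__ == "__main__":
    print(gauge_rr_single_stove(study))
    all_results = [x for runs in study.values() for x in runs]
    # Crude text histogram: it should look roughly bell shaped.
    for start, count in frequency_bins(all_results, 20).items():
        print(f"{start:>4}-{start + 20:<4} {'#' * count}")
```

In this made-up data most of the variation comes from differences between cooks rather than from the test itself, which would point toward better operator training before changing the procedure. Nine points are far too few to judge a bell shape; the histogram only becomes meaningful as results accumulate.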

With Brian in Khartoum and now making stoves for Darfur, it is worthwhile talking about new product introduction quality control – there are several ways to make sure that your new stove performs in every way like you designed it to, and you don’t necessarily have to build a fire in each one. In manufacturing we use several statistical techniques to prove within a certain degree of confidence that your manufacturing effort is good. The process goes something like this:

  • Determine what it is about your stove that makes it work right – tight construction, the right pot gap width, weight (are the right materials used throughout?), time to boil X liters of water (firepower), amount of wood for a specific task (efficiency), etc. You are looking for a few measurable things that correlate with new stove owners being pleased; you want to make the correlation as reliable as possible, and you want the measurements to be as easy and quick as possible.
  • For just a few stoves you’ll be handling every one to see if it feels right – if you know how your stove tests well enough then you should be able to tell bad product, but LOTS of stoves means too much handling, so this start up phase is the time to practice inspections and testing. Do sloppy stoves mean bad efficiency and short life in homes? How sloppy is too sloppy? Nothing is perfect so some issues are OK (small gaps, poor joints, etc.) and have to be passed, and others are unacceptable; don’t sweat the small stuff.
  • Decent measurements might be weight, air gap width using a standard pot (or no pot – just a measurement), and fuel efficiency. Often you can do the efficiency measurement only when you suspect it needs to be tested (new manufacturer, new materials, different city, etc.) – this test is the hardest and has the most error. In any case you should do a gauge R&R of the measurement, to see that the results are normally distributed so that you trust your measurements – how efficient your stove is will be something people want to talk about. Accumulate measurement numbers all the time and they will add up to a good record that someday you can publish to show the stove’s effectiveness.
  • Set specifications – create some measurements that show obviously that something is either acceptable or not. The stove maker can use them, you can, people in different cities can. Any spec is better than no spec – you can’t talk about quality unless you have even a tiny amount of information on how things vary. The challenge of doing this in strange countries is there, but maybe you will get lucky and everything will be perfect - there is no variation to measure!
  • The best quality technique is 6 Sigma (look up the DMAIC process in particular) – only a few defective parts in a million allowed. Here you measure parts even as they are made, so bad stoves are never created. If the parts are handmade and there are weighing scales available, then weighing them can be a quick method of checking craftsmanship. And there are lots of variations of 6 Sigma – I practice the “lean” version where the main goal is to eliminate as many parts and operations during manufacture as possible. Fewer parts = fewer things to worry about. Under ideal circumstances you do very little inspecting at the end, since why would there be a bad stove? But if you can’t be there while things are made, all you can do is simplify the design, and ask to get the first few asap.
  • The government and other folks use the AQL method – specifying an Acceptable Quality Level – where there are tables to tell you how many out of each thousand to test, so that you don’t have too many bad stoves (all based on sampling statistics). There are on-line forms that let you plug in your desired quality level and tell you how many stoves to test, but you still have to decide what is an effective measurement. I have used this method but am not excited by it – it assumes that you accept bad parts, and that you don’t have good enough control of manufacturing. And this is true if you don’t have a good enough relationship with your machine shop.
  • Continuous improvement is always a part of the equation – your first stoves need to be good or people will be disappointed and then you’ll spend the rest of your life answering questions and fixing things. Half of quality is about not wanting to waste time like this. Nip problems in the bud as quickly as possible, bring things to the attention of your stove man and emphasize why problems hurt his chances of future business. Keep reminding them of their defects – details about problems will hopefully keep them thinking about improving. Unfortunately, this is about the time when you need competitors, so that you have negotiating power. Having a second supplier is a great thing, because distributing business between them (even if one is a little more expensive) keeps both on their toes – you can have one make a much smaller percentage, but not having two will cause you problems eventually.
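To give a feel for the sampling arithmetic behind the "how many stoves do I test" question, here is a minimal sketch. It is not the published AQL tables; it is a simplified binomial calculation that assumes a large batch, a pass/fail inspection, and that the batch is accepted only if at most a chosen number of bad stoves turn up in the sample. The function names and the 10% and 5% figures are assumptions for illustration.

```python
import math


def accept_probability(sample_size, max_defects, defect_rate):
    """Chance that a batch passes inspection.

    The batch passes if the sample contains at most `max_defects` bad
    stoves, assuming each stove is independently bad with probability
    `defect_rate` (binomial model, batch much larger than the sample).
    """
    return sum(
        math.comb(sample_size, k)
        * defect_rate ** k
        * (1 - defect_rate) ** (sample_size - k)
        for k in range(max_defects + 1)
    )


def sample_size_zero_defects(defect_rate, risk):
    """Smallest sample in which finding NO bad stoves is convincing.

    Returns n such that a batch with the given defect rate would slip
    through a zero-defect inspection with probability <= risk.
    """
    if not (0 < defect_rate < 1 and 0 < risk < 1):
        raise ValueError("defect_rate and risk must be between 0 and 1")
    return math.ceil(math.log(risk) / math.log(1 - defect_rate))


if __name__ == "__main__":
    # "I want at most a 5% chance of passing a batch that is 10% bad."
    n = sample_size_zero_defects(defect_rate=0.10, risk=0.05)
    print(f"inspect {n} stoves and accept only if none are bad")
    print(f"chance a 10%-bad batch still passes: {accept_probability(n, 0, 0.10):.3f}")
```

The uncomfortable part shows up quickly: catching a low defect rate by final inspection alone takes a lot of stoves, which is one more argument for controlling quality while the stoves are being made.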
Next: Links just on stove implementation experiences from around the world

Monday, October 02, 2006

Berkeley Darfur Stove Project Well Underway!

After several stove design revisions and a good bit of what we hope is field-appropriate stove testing, Brian (the project engineer-on-the-ground) has arrived in Khartoum – let the next phase of the project begin! For those of you who are catching up, LBNL researchers and UCB students have been working on a new wood/biomass burning cookstove for IDPs (internally displaced people – a nice name for refugees who still live within their own country) in Darfur, Sudan. I won’t get into the political or humanitarian aspects of the problem here (see earlier posts and the links within them); suffice it to say that 2 million people live in mass camps in western Sudan, supplied with food aid by international agencies, but they are not provided with fuel to cook it. The women are forced to hike ever farther away from the camps to find firewood in an increasingly denuded landscape – at serious risk to their security. A single round trip can take 7 hours and several trips a week are necessary. Presently they use mostly simple stoves and “3 stone fires”, which are not inherently fuel efficient – improved stoves which are suited for their environment and cooking situation will not restore their homes, but they would reduce their risk (by reducing the number of trips they need to make) and buy some time until the world can figure out how to get fuel trucked in, so that people are safer and the environment can recover from the deforestation caused by so many people in so small a place.

This research group developed a stove which addresses local needs – it fits traditional pot sizes/shapes, it can be staked to the ground to secure it during their vigorous stirring, the pots are shielded from the wind to improve efficiency, it can be manufactured locally by craftsman metalworkers, and it saves sufficient fuel day after day so that the purchase price is affordable. To find ways to make it more manufacturable locally, they contacted Engineers Without Borders – engineering skills and experience were needed to make the implementation on a mass scale more likely. Western engineering methods and tools were then used to adjust the design, but the local materials, skills, tools, and cooking methods were always kept in mind – even the stove testing used actual pots (and onions stir fried in oil – to simulate their local mulah dish) to monitor the efficiency as different designs were evaluated.

Back to the present. Brian will stay in Khartoum (the far away capital, but the only city that has some of the metal working tools needed to see how stoves can be initially built) for 3 months, assessing the situation, developing local contacts for metal supplies and tools, establishing local efficiency testing for quality control of stoves, and diving into the local culture. He left the U.S. with the best information that we could supply him with – stove plans, possible methods of assembly, an assembled stove and several kinds of templates for teaching people how stoves should be made (so that usability and efficiency are not altered significantly), and the best stove testing methods that we know of. His computer contains SolidWorks software models of the stove – one of the tools that EWB contributed – so that he can demonstrate how the stove parts are assembled, what they do, why they are necessary, and he can alter the design as needed. This won’t help him directly in machine shops there, but it is a powerful tool for talking about stove designs and illustrating our approach. But all of this only tries to anticipate what he will find in Khartoum (let alone Darfur!), where we expect that nothing is what we think it is, despite several fact finding trips there to see what the stove building possibilities and use situations are.

Once he shakes off the jetlag and adjusts to the local accommodations and diet, he’ll explore the already identified metalworking shops, find more choices, see how local craftsmen make sheet metal products by hand, see if immediate modifications need to be made because of obvious local limitations (we hope not many – we tried to take many possibilities into consideration), and try to establish how prototype stoves might be made and how much they will cost to produce. Lots of flexibility and patience are expected to be necessary. The latest stove design left without any efficiency testing (it is hopefully just an incremental change to the previous tested design), but Brian has a stove testing lab with assistants there, so he has some hands to help him. We tested in Berkeley (and Alameda – I am still trying to remove the soot stains) with simulated Sudan conditions, but how do they really cook there, and how will this change the testing procedure? What size pieces of wood do they use in the camps, can the stoves be elevated for ease of tending or must they be on the ground, and does the WBT adequately simulate local conditions (though it is expected that this will only be used for stove development – field cooking tests are the next phase of the project)? His charter is to start by developing the process for the first 50 prototype stoves – we can only hope that he can start to understand real implementation in Darfur, where many other stove use questions need to be answered.

How do they feed their fires, how do they view any process of improving the efficiency of precious wood utilization, can their cooking processes be slightly modified just to save wood, will they share stoves between families, what challenges do they face in their lives that may hinder efficiency improvements (Dean Still’s “sick child scenario” – real cooks cannot just focus on tending fires optimally), what further stove design changes will they ask for, and what do they value in stoves that we can emphasize so that efficiency improvements will not be lost over time?

It is my estimation that the stove design itself addresses some of the wood consumption problem, but just as important is how fires are tended – a program that just seeks to teach the best possible methods (no new stoves) would by itself have a tremendous impact on efficiency. What prevents IDP women from selecting the best present methods (copying the most efficient cooks) and disseminating these through the camps – to reduce fuel use on their own? This kind of cultural issue can only be answered by visiting, sitting, and learning. That is too much to worry about right now, especially with the present situation in the camps becoming more volatile; we can only start the process by making stoves available for the cultural evaluation phase – will people use them, and will they indeed reduce fuel use? Once this is worked out, expanding manufacturing and working out the dissemination details will come next. I doubt that we will ever be done evaluating the impact of the whole effort (stove design and manufacture, plus effective implementation), and engineers can help with this as well – we are trained to collect data, evaluate it, then solve the problems identified by the analysis.

Next: Testing and Statistics for Stovers