Ridzwan, do you want to use the Response Optimizer for a DOE? TIA, Barbara.

Barbara B, Apr 18: To use the Response Optimizer you have to treat your Taguchi design like a factorial design. The math behind it is identical, but Taguchi uses a different approach to optimization than the methods offered in the Response Optimizer. To use the Response Optimizer you have to analyze the responses separately first and then optimize the results. In this post and the following ones you'll find several screencasts which show the steps to reach an optimum. And last but not least, for the use of the Response Optimizer see the attached file.

Barbara, I watched your video and there are several points that I do not understand. Thank you very much, Barbara.

Barbara, I need your help again. I tried 12 factors at 3 levels to get the Response Optimizer by following your nice video. However, I cannot do that because the Response Optimizer button does not appear. I have also attached the design below. Can you give any advice? Thanks a lot, Barbara. Ridzwan.

Barbara, is it true that when the factors have more than 2 levels the Response Optimizer cannot work? Please help me.

Miner (Forum Moderator, Staff member), Apr 19: The response optimizer will work for 2-level factorial designs and response surface designs. It is not available for General full factorial designs. I do not believe that it is available for Taguchi designs either. In those cases, you must simply select the best levels manually.

Barbara B, Apr 19: Ridzwan, sorry for the insufficient postings before. What methods did you use to eliminate factors with the Pareto chart? It will be analyzed using a regression approach.
The chapter concludes with consideration of a variety of methods for data entry in Minitab, of data manipulation and of the detection of missing and erroneous data values. When was this value obtained? What time period does that value represent? Because of this association with time there is information contained in the time-order sequence of the values. To recover this information you will need to look at your data in a time-ordered plot, which is commonly called a running record or time-series graph.
In order to introduce running records or run charts, consider the time series of weights g of glass bottles in Table 2. Bottle weight is a key process output variable in the food packaging industry. Each bottle was formed in the same mould of the machine used to produce the bottles and the time interval between sampling of bottles was 15 minutes. The target weight is g and the production run was scheduled to run for a total of 12 hours. On opening Minitab the screen displayed in Figure 2. Two main windows are visible. The Session window displays the results of analyses in text format; initially it displays the date, time and a message of welcome.
It is also possible to perform tasks by entering commands in the Session window instead of using the Minitab menus. The Data window Table 2. It is possible to use multiple worksheets within a single project. Note that the blue band across the top of the Session window is a deeper colour than that across the top of the Data window, indicating that the Session window is currently active.
Take a moment to click on the blue bands to make the worksheet active and then the Session window active again. A third component, in addition to the Session and Data windows, of the new Minitab project just opened up, is the Project Manager, which is minimized at this stage. Note the corresponding icon labelled Proje. Figure 2. Clicking on an icon makes the corresponding component active. Note that the message displayed indicates the project component associated with the icon and gives the keys that may be used to make the component active as an alternative to clicking on its icon.
Enter all 25 of the weight data values in Table 2. In order to access the dialog box for the creation of a run chart first click the Stat menu icon see Figure 2. The Run Chart dialog box can now be completed as shown in Figure 2. Highlight Weight g in the window on the left of the dialog box and click on the Select button so that Weight g appears in the window labelled Single column:. Alternatively highlight Figure 2.
Weight g and double-click. Enter Subgroup size: 1 in the appropriate window as each sample consisted of a single bottle. Had the weights of samples of four bottles been recorded every 15 minutes then Subgroup size: 4 would have been entered. Clicking the Options. Click OK to return to the main dialog box and then click OK again to display the run chart; see Figure 2. Those involved in running the process can learn about its performance from scrutiny of the run chart. Weight is plotted on the vertical axis, with the weight of each bottle represented by a square symbol and the symbols connected by dotted lines indicating the sequence of sampling.
Had, for example, four bottles been weighed every 15 minutes then Minitab offers the choice of a run chart with symbols corresponding to either the mean or median of the weights of each sample or subgroup of four bottles. The horizontal axis is labelled Observation — each sample of a single bottle may be thought of as an observation of the process behaviour. The horizontal line on the chart corresponds to the median weight of the entire group of 25 bottles weighed. The median weight is Scrutiny of the run chart reveals 12 points above the median line, 12 points below the median line and the point Figure 2.
The sample of five bottle weights The sample of six bottle weights In this case, where there is an even number of bottles in the sample, the median is taken as the value midway between the middle pair of values, i. The median of This is the sort of question on which statistics can shed light — further discussion of such questions and the tools available in Minitab to answer them appears in Chapter 7.
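The rule for the median with odd and even sample sizes can be sketched in a few lines of Python. This is only an illustration: the weights below are hypothetical values in grams, not the ones from the chapter's tables, and the function name `median` is chosen here for convenience.

```python
def median(values):
    """Return the middle value, or the midpoint of the middle pair."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty sample is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd sample size: a single middle value exists.
        return ordered[mid]
    # Even sample size: take the value midway between the middle pair.
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical bottle weights in grams (illustrative only).
five_bottles = [488.1, 490.3, 487.6, 492.0, 489.4]
six_bottles = five_bottles + [491.2]

median_of_five = median(five_bottles)  # the third ordered value
median_of_six = median(six_bottles)    # midway between third and fourth
```

With five values the median is the third value in order; adding a sixth value switches the calculation to the midpoint of the third and fourth ordered values.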
However, before considering the performance in relation to any target or specification limits, a much more fundamental question should be asked: is the process performing in a stable, predictable manner? The information displayed beneath the run chart is relevant. For an explanation, the Help facility provides the overview of run charts displayed in Figure 2.
One way to access this information is to click on the Help button in the bottom left-hand corner of the Run Chart dialog box — see Figure 2. Note also the provision of how to, example, data and see also links to further sources of information to aid the user to learn about run charts. In addition, an explanation of the dialog box items is given and a link to information on the options available in creating a run chart in Minitab via. Links to related topics are also provided within the text — in the case of the run chart the related topics are subgroup and median.
A process performing in a stable and predictable manner is said to exhibit common cause variation only and to be in a state of statistical control. When a process is affected by special cause variation, i. Data for a process affected only by common cause variation exhibit randomness while data for a process affected by special cause variation do not. In order to test for randomness one must scrutinize the P-values in the text boxes beneath the run chart.
P-values will be explained in Chapter 7, but at this stage one need only know that it is generally accepted that any P-value less than a significance level, or α-value, of 0. For the weight data none of the P-values is less than 0. The run charts in Figures 2. In the first scenario, displayed in Figure 2. The line segment superimposed on the display indicates an apparent initial downward trend in weight. In the second scenario, displayed in Figure 2.
The rectangle superimposed on the display indicates a period during which weight oscillates rapidly. In the third scenario, displayed in Figure 2. The rectangle superimposed on the display indicates a period during which there is an absence of weight values close to the median weight represented by the reference line — typical when mixtures occur. In the fourth scenario, displayed in Figure 2. Two clusters — groups of points corresponding to bottles with similar weights — are indicated in the display. A process team should respond to evidence of special cause variation by taking steps to carry out a root cause investigation in order to determine the extraneous factor or factors affecting process performance.
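To make the idea of a test for randomness more concrete, the following Python sketch counts runs about the median and converts the count into approximate P-values for clustering (too few runs) and mixtures (too many runs). It uses the textbook normal approximation for the number of runs; whether Minitab computes its P-values in exactly this way, and how it treats points lying on the median, are assumptions here. The function names and the two data sets are hypothetical.

```python
import math
import statistics


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def runs_about_median(values):
    """Count runs about the median and approximate two P-values."""
    centre = statistics.median(values)
    # Assumption: points exactly on the median are dropped before counting.
    above = [v > centre for v in values if v != centre]
    m = sum(above)          # number of points above the median
    n = len(above) - m      # number of points below the median
    if m == 0 or n == 0 or m + n < 3:
        raise ValueError("too few points on either side of the median")
    # A new run starts whenever consecutive points change side.
    runs = 1 + sum(1 for a, b in zip(above, above[1:]) if a != b)
    expected = 2.0 * m * n / (m + n) + 1.0
    variance = (2.0 * m * n * (2.0 * m * n - m - n)) / (
        (m + n) ** 2 * (m + n - 1)
    )
    z = (runs - expected) / math.sqrt(variance)
    return {
        "runs": runs,
        "expected": expected,
        "p_clustering": normal_cdf(z),       # small when runs are too few
        "p_mixtures": 1.0 - normal_cdf(z),   # small when runs are too many
    }


# Hypothetical sequences: one drifting steadily, one alternating sides.
drifting = runs_about_median([1, 2, 3, 4, 5, 6, 7, 8])
alternating = runs_about_median([1, 8, 2, 7, 3, 6, 4, 5])
```

The drifting sequence gives only two runs and a small clustering P-value, while the alternating sequence gives the maximum number of runs and a small mixtures P-value. Tests for trends and oscillation rest on runs up and down rather than runs about the median and are not covered by this sketch.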
Once any such factor or factors have been identified, steps may be taken to eliminate them. It should be noted that a signal of evidence of the presence of special cause variation from a run chart P-value less than 0. The reader is urged to tap the huge Minitab Help resource constantly. Further details are provided in Chapter 11, and the author suggests that it will be beneficial to refer to these details in parallel with study of this and later chapters. Minimize the run chart and note how the text run chart of Weight g has appeared in the Session Window indicating that in the Minitab session to date a run chart of the weight data has been created.
The display in Figure 2. The contents of the open Session folder indicate the date and time of the creation of Worksheet1 and the subsequent display of the data in the run chart. The run chart is in the Graphs folder. On opening the ReportPad folder, a report document may be created with appropriate text being entered as shown in Figure 2. Alternatively, on right-clicking the active run chart a menu appears; clicking the option Append Graph to Report adds the chart directly to the ReportPad.
The typical final step in such a first session with data from a process is the naming and saving of the Minitab project file. The project file will be created as Weight.MPJ, with the extension .MPJ indicating the file type as a Minitab project. Imagine that a discussion took place with the process manager on concern about bottle weight being on target and that he consults a Six Sigma Black Belt, who does some further analysis of the available data using Minitab and reassures the process team that there is no evidence to suggest that the process is off target.
As production of the batch of bottles continued, further data became available which are displayed in Table 2. Launch Minitab and open the project file Weight.MPJ created and saved earlier. The Toolbar at the top of the screen may be used to access components of the Project as indicated earlier Figure 2. Click on the Current Data Window icon or on the Show Worksheets Folder icon to access the only worksheet currently in the project.
Add the additional data to the first column of the worksheet. The updated chart is displayed in Figure 2. The P-value for clustering of 0. Scrutiny of the run chart reveals that the additional data points form a cluster, and scrutiny of the actual data values in Table 2. Thus it would appear that corrective action, to remove a special cause of variation affecting the process, could be necessary. On accessing the ReportPad via its icon one can type appropriate further comments and add the updated run chart as shown in Figure 2.
Bottle weight is a key process output variable in this context. People involved with the process will have the knowledge of the key process input variables that can be adjusted in order to bring weight back to the desired target of g. You can readily check the second median by putting the data in Table 2. However, the objective evaluation of evidence from data using sound statistical methods is preferable to subjective decision-making. Right-clicking on an active run chart and then clicking on StatGuide opens a window with a display of Contents on the left of the screen and with Index and Search tabs.
On the right specific information on run charts is given. Arrows enable navigation around the topics provided, example output is given, and the location of the data used to create it is indicated together with interpretation. The More button leads to in-depth details of how the interpretation is made. The reader will find further details of Minitab StatGuide in Chapter 11, and the author suggests that it will be beneficial to refer to these details in parallel with study of this and later chapters.
Having completed your work with the bottle weight data discussed above, the natural thing to do would be to save the updated Minitab project file Weight. It is recommended that you save the worksheet containing the 40 weights using the name Weight1. The worksheet file will be created as Weight1.MTW, with the extension .MTW indicating the file type as a Minitab worksheet. The response variable, weight, considered above is an example of a continuous random variable in the jargon of statistics. For a second example of a run chart the number of incomplete invoices per day produced by the billing department of a company will be considered.
The daily count of incomplete invoices is referred to as a discrete random variable in statistics. It should be noted that both weighing bottles and counting incomplete invoices are examples of measurement. It is recommended that you download the files and store them in a directory on your computer.
The data for this example, available in Invoices1. Had you omitted to save the updated bottle weight project you would have been offered the option of doing so on clicking OK. It is strongly advised that you save projects as you work your way through this book as many data sets will provide opportunities for analysis using other methods in later chapters.
A new blank project file is opened. In order to save the reader the tedious task of typing in the initial invoice data displayed in Figure 2. Select the file Invoices1.MTW from the directory in which you have stored the downloaded data sets and click on Open. You may be asked for confirmation that you wish to add a copy of the content of the worksheet to the current project, in which case it is necessary to click OK. There are three columns in the worksheet displayed in Figure 2. The first is labelled C1-D, indicating that it contains date data. The columns labelled C2 and C3, with no extensions, hold numerical data: the daily number of invoices processed and the daily number of invoices found to be incomplete.
A run chart of the number of incomplete invoices per day could be misleading since the number of invoices processed daily varies. Thus there is a need to calculate the proportion of incomplete invoices per day. In the window labelled Expression: the formula may be created by highlighting the names of columns, using the Select button and the calculator keypad. Clicking OK implements the calculation. Note that a menu of functions is available for use in more advanced calculations.
Note too that if one checks the Assign as a formula box then whenever additional data are entered in the second and third columns the percentage incomplete will be calculated automatically. Columns that have been assigned a formula are indicated by a green cross at their heads. Moving the mouse pointer to the horizontal reference line, representing the median percentage incomplete on the chart, triggers display of a text box giving the median as Thus the current performance of the invoicing process is such that approximately one in every six invoices is incomplete.
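The calculator step amounts to dividing one column by another, row by row, and scaling to a percentage. A minimal Python sketch of that calculation follows; the function name, the argument names and the daily counts are hypothetical, and the optional rounding to one decimal place mirrors the use of the Round function described for the updated data.

```python
def percent_incomplete(processed, incomplete, decimals=None):
    """Return the daily percentage of incomplete invoices."""
    if len(processed) != len(incomplete):
        raise ValueError("both columns must have the same number of rows")
    percentages = []
    for total, bad in zip(processed, incomplete):
        if total <= 0:
            raise ValueError("number processed must be positive")
        value = 100.0 * bad / total
        if decimals is not None:
            value = round(value, decimals)
        percentages.append(value)
    return percentages


# Hypothetical daily counts (not the contents of the Invoices worksheets).
processed = [120, 95, 140, 110]
incomplete = [20, 15, 28, 22]

percentages = percent_incomplete(processed, incomplete, decimals=1)
```

Working with the percentage rather than the raw count keeps days with very different workloads comparable, which is the reason given above for not charting the counts directly.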
The percentage for the sixth day appears to be considerably higher than all the other percentages; in fact, a new inexperienced employee processed many of the invoices during that day. A median of The run chart in Figure 2. As an exercise the reader is invited to access, in a new project, the updated data stored in the worksheet Invoices2.MTW and re-create this run chart. In doing the calculation of percentage incomplete on this occasion, start to create the Expression: required by selecting the Round function under Functions:.
Choose All functions from the menu if not already on view , scroll down the list, highlight Round and click Select. On clicking OK the proportions of incomplete invoices as percentages will be calculated and rounded to one decimal place. The updated run chart in Figure 2. Double-clicking an axis label yields an Edit Dialog Label box in which any desired label may be entered in the Text: window. Scrutiny of the updated run chart suggests that the process changes have been effective in lowering the percentage of incomplete invoices.
Two of the P-values are less than 0. This provides formal evidence of a process change having taken place. Alternative data displays, using methods discussed later in the chapter, may be used to highlight the apparent effectiveness of the process changes. The median for the period from 31 January onwards was Here we recorded a single variable for each bottle so we refer to univariate data. Had we measured weight and height we would have had bivariate data; had we measured weight, height, bore, out of vertical etc. we would have had multivariate data.
The process appears to have been behaving in a stable, predictable manner during the period in which the data were collected. When a process exhibits this sort of behaviour and the measured response is a continuous variable, such as weight, then display of the data in the form of a histogram is legitimate and can be very informative. Once a bottle has been weighed, imagine that it is put in one of the series of bins depicted in Figure 2. The lightest bottle recorded weighed The second lightest bottle weighed The next two lightest bottles weighed The heaviest bottle weighed The convention adopted in Minitab is that a bottle weighing For the complete sample of 25 bottles the number of bottles or observations in each bin is known as its frequency.
The ranges A chart with weight on the horizontal axis and frequency represented on the vertical axis by contiguous bars is a histogram. In order to work through the creation of the histogram with Minitab you require the weights in Table 2.
They are provided in worksheet Weight1A. The initial part of the dialog is displayed in Figure 2. Accept the default option of Simple and click OK to access the subdialog box displayed in Figure 2. In the Graph variables: window, select the variable to be displayed in the histogram, Weight g in this case, and click OK. The histogram displayed in Figure 2. Note that moving the mouse pointer to a bar of the histogram leads to the bin interval and frequency for that bar being displayed in a text box.
In Figure 2. The frequency for this bin was 5, indicating that five of the 25 bottles in the sample had weight in the interval The histogram of weight for a sample of bottles in Figure 2. This may indicate that the sample includes bottles formed by two moulds, that the process has been run in different ways by the two shift teams responsible for its operation, etc.
The third bin has midpoint and the corresponding bin range or interval is In order to change the bins used in the construction of the histogram in Figure 2. Editing the list of cutpoints to become , as displayed in Figure 2. Note that the bimodal nature of the distribution of bottle weights is no longer evident. Thus potentially important information in a data set may be masked by inappropriate choice of binning intervals. Further reference to distribution shape will be made later in the chapter. Location gives an indication of what is typical in terms of process performance.
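The effect of the choice of cutpoints on a histogram can be sketched by counting frequencies directly. In the Python example below the weights are hypothetical, and the bins are taken to include their lower cutpoint and exclude their upper one, except for the last bin, which includes both; that boundary convention is an assumption made for the sketch rather than a statement of Minitab's rule.

```python
def bin_counts(values, cutpoints):
    """Count how many values fall in each bin defined by the cutpoints."""
    counts = [0] * (len(cutpoints) - 1)
    last = len(counts) - 1
    for value in values:
        for i in range(len(counts)):
            lower, upper = cutpoints[i], cutpoints[i + 1]
            # Assumed convention: [lower, upper), with the last bin closed.
            if lower <= value < upper or (i == last and value == upper):
                counts[i] += 1
                break
    return counts


# Hypothetical bimodal sample of bottle weights in grams.
weights = [486, 487, 487, 488, 488, 488, 489,
           493, 494, 494, 494, 495, 495, 496]

narrow = bin_counts(weights, [486, 488, 490, 492, 494, 496, 498])
wide = bin_counts(weights, [486, 492, 498])
```

With the narrow bins the frequencies show two separate peaks with an empty bin between them; with the wide bins the two frequencies are equal and the two groups can no longer be told apart.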
Suppose that the weight data displayed in Figure 2. Part of the supplied worksheet, Weight3.MTW, containing the data is displayed in Figure 2.
It shows the final four bottle weights for machine A and the first four bottle weights for machine B. Column C1 contains text values A and B indicating which of the two machines produced the bottle with weight recorded in column C2. Note the designation of the first column as C1-T, indicating that it stores text values.
Click on Multiple Graphs. Finally click OK, OK. The two histograms are shown in Figure 2. The histogram for machine A is in the left-hand panel and that for machine B is in the right-hand panel. With the graph active, the Edit Scale menu was accessed by moving the mouse pointer to the horizontal X axis and double-clicking when the text box displaying the text X Scale appeared. The entries in the window labelled Positions of ticks: were changed to , , and The triangular markers were superimposed using the polygon tool from the Graph Annotation Tools toolbar, but the detail need not concern us here.
These marks indicate the mean weight for each machine. The mean will be defined later in this chapter. The markers indicate the horizontal locations of the centroids of the histograms. Cut-outs of the histograms would balance on the knife-edges represented by the upper vertices of these triangles.
In terms of the target weight of g for the bottles, it is clear from the data display that both machines are operating off target. The difference in location for the two machines and in their performance, relative to the target bottle weight of g, may be highlighted as shown in Figure 2. Details of how to do this will be the subject of an exercise at the end of this chapter.
In addition to visual assessment of location from display of the data it is possible to measure location by calculation of descriptive or summary statistics. The median is a widely used measure of location and was referred to in the previous section in relation to run charts. The mean is a second widely used measure of location and is obtained by calculating the sum of data values in a sample and dividing by the sample size, i.
In common parlance many refer to the mean as the average. Calculation of the mean with associated statistical notation is given in Box 2. The means of bottle weight for machines A and B are The triangular markers in Figure 2. The Figure 2.
Consider a sample of four bottles with weights g The sum of the four data values is With an even sample size of , the median is the mean of middle two weights when the data have been ordered. Note that the means and medians for each machine are very similar: This is typical when distributions and associated histograms are fairly symmetrical.
In such cases it would not really matter which measure of location is used to summarize the data. The icon to the left of the Display Descriptive Statistics. The standard deviation is a widely used measure of variability which will be introduced later in this chapter. Weight g is entered under Variables: and Machine entered in By variables:. In order to obtain the mean and median for the two machines, select Weight g in the Variables: window and use the Statistics button to edit the list of available Statistics to the ones shown in Figure 2.
Mean, Median, N nonmissing and N missing. On implementation of the procedure the output in Panel 2. The N column indicates that the data set includes values of weight for bottles from each of the machines, A and B. The final two columns give the means and medians. Note that, with the mouse pointer located in the Descriptive Statistics section in the Session window, a right click displays a pop-up menu through which access to StatGuide information on Descriptive Statistics may be obtained.
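The same four statistics, broken down by a grouping column, can be sketched in Python with the standard library. The data, the function name and the use of `None` to mark a missing weight are all assumptions for illustration; the dictionary keys simply reuse the names of the statistics selected above.

```python
import statistics


def describe_by_group(labels, values):
    """Summarize values separately for each group label."""
    groups = {}
    for label, value in zip(labels, values):
        present, missing = groups.setdefault(label, ([], []))
        # Assumption: a missing measurement is recorded as None.
        (missing if value is None else present).append(value)
    summary = {}
    for label in sorted(groups):
        present, missing = groups[label]
        summary[label] = {
            "N nonmissing": len(present),
            "N missing": len(missing),
            "Mean": statistics.mean(present),
            "Median": statistics.median(present),
        }
    return summary


# Hypothetical weights in grams for two machines; one value is missing.
machine = ["A", "A", "A", "A", "B", "B", "B", "B", "B"]
weight = [488.0, 489.0, 490.0, 491.0, 492.0, 493.0, None, 495.0, 496.0]

summary = describe_by_group(machine, weight)
```

Because both hypothetical samples are symmetrical, the mean and median agree within each machine, in line with the remark above about fairly symmetrical distributions.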
Consider now data on length of stay (days) in hospital (LOS) for stroke patients admitted to a major hospital during a year.
The data are available in LOS. A histogram of the data is shown in Figure 2. Such data could be highly relevant during the measure phase of a Six Sigma project aimed at improving stroke care in the hospital. The histogram is far from symmetrical. With the long tail to the right it exhibits what is known as positive skewness or upward straggle. A histogram that had the shape of the mirror image of the one in Figure 2. Of course LOS cannot be a negative number, so a more logical set of bins would be 0 to 10, 10 to 20, 20 to 30, etc.
This gives the histogram in Figure 2. The default descriptive statistics provided by Minitab for length of stay are shown in Panel 2. SE Mean denotes the standard error of the mean, and Q1 and Q3 denote the first and third quartiles respectively. These statistics are explained later in the book. As is typical with data exhibiting positive skewness, the median is less than the mean.
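A small numerical illustration may help to show why a long right tail pulls the mean above the median. The lengths of stay below are invented for the purpose and are not taken from the LOS worksheet.

```python
import statistics

# Hypothetical lengths of stay in days, with a long tail to the right.
length_of_stay = [3, 5, 6, 7, 8, 10, 12, 15, 40, 94]

mean_stay = statistics.mean(length_of_stay)
median_stay = statistics.median(length_of_stay)

# The two longest stays raise the mean but leave the median untouched.
gap_in_days = mean_stay - median_stay
```

Here the two very long stays account for most of the total, so the mean lies well above the median even though eight of the ten patients stayed 15 days or fewer.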
In this case the median length of stay is approximately one week less than the mean. One final facility covered in this section is that of being able to add reference lines corresponding to values on the horizontal scale. This is useful for giving a visual impression of how well a process is performing in terms of customer requirements. For example, suppose that a customer of the bottle manufacturer specifies that bottle weight should lie between and g.
Thus the customer is specifying a lower specification limit (LSL) of g and an upper specification limit (USL) of g. For the first sample of bottle weight data (Weight1A.MTW) presented in this chapter, having selected Weight g as the variable to be graphed in the form of a histogram, click on the Scale. Clicking OK, OK twice yields the histogram with reference lines indicating the specification limits shown in Figure 2. In Chapter 6 indices for the assessment of how capable a process is of meeting customer specifications will be introduced. In order to introduce measures of variability, consider two samples of five bottles from two moulding machines, P and Q.
Set up the data as shown in Figure 2. The dot plot is a useful alternative form of data display to the histogram, especially for small samples. Weight g is selected in Graph variables: and Machine in Categorical variables for grouping:. This yields the display in Figure 2. Both samples have mean There is greater variability, or spread, for weight in the case of machine P than in the case of machine Q. The reader is invited to verify that the default set of descriptive statistics for the two machines displayed in Panel 2.
One measure of variability is the range, i. Using the minimum and maximum values in Panel 2. The greater range for machine P indicates the greater variability in the weight of bottles produced on it than the variability in the weight of bottles produced on machine Q. The range has applications in control charts for measurement data. However, one criticism of the range as a measure of variability is that it only uses two measurements from all the measurements in the sample.
The standard deviations StDev given in Panel 2. The greater standard deviation for machine P indicates the greater variability of bottle weight for it compared with the variability of bottle weight for machine Q. Detailed explanation of the calculation of standard deviation for machine P is given in Table 2. Panel 2. This indicates that the third bottle had a weight 0. Similarly, for example, the fifth bottle had weight 2. The deviations from the mean always sum to zero. The absolute deviation ignores the sign of the deviation and simply indicates by how much each measurement deviates from the mean.
The mean absolute deviation MAD is 1. The reader is invited to verify that the corresponding value for machine Q is 0. The greater mean absolute deviation for machine P indicates the greater variability for it compared with that for machine Q. Although MAD is a perfectly viable measure of variability it has disadvantages from a mathematical point of view.
An alternative approach to taking the absolute values of deviations is to square the deviations and to take the mean of the squared deviations as a measure of variability. The mean squared deviation (MSD) for machine P is 3. Once again the greater mean squared deviation for machine P indicates the greater variability for it compared with that for machine Q. However, there are two disadvantages with MSD. First, since the deviations are in units of grams (g) the squared deviations are in units of grams squared (g²).
Second, samples are generally taken from populations in order to estimate characteristics of the populations, but statistical theory shows that MSD from sample data underestimates MSD for the population sampled. For example, in the case of a production run of bottles the population of interest would be all bottles produced during that particular run. An important measure of variability is sample variance, which is calculated as the sum of squared deviations divided by the number which is one less than the sample size. Thus for machine P the variance is given by Finally, in order to get back to the original units, the sample standard deviation is obtained by taking the square root of the variance.
This yields a standard deviation of 2. The reader is invited to verify that the standard deviation for machine Q is 1. The main point is that variance and standard deviation are very important measures of variability — the technical details of the underlying calculations are not important. In broad terms all three measures indicate that the variability or spread of the weights for machine P is approximately twice that for machine Q. Small artificial samples were used here for illustrative purposes. One would be very wary of claiming that there is a real difference in variability in the weights of bottles produced on the two machines on the basis of such small samples.
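The measures of variability discussed above (range, mean absolute deviation, mean squared deviation, sample variance and sample standard deviation) can be computed side by side in a short Python sketch. The two samples are hypothetical stand-ins for machines P and Q, chosen so that both have the same mean; they are not the values in the chapter's worksheet, and the function name `spread` is made up for this example.

```python
import math


def spread(sample):
    """Return several measures of variability for a sample."""
    n = len(sample)
    mean = sum(sample) / n
    deviations = [x - mean for x in sample]
    squared = [d * d for d in deviations]
    variance = sum(squared) / (n - 1)  # divisor is one less than sample size
    return {
        "mean": mean,
        "sum of deviations": sum(deviations),  # always zero, up to rounding
        "range": max(sample) - min(sample),
        "MAD": sum(abs(d) for d in deviations) / n,
        "MSD": sum(squared) / n,
        "variance": variance,
        "StDev": math.sqrt(variance),
    }


# Hypothetical weights in grams for five bottles from each machine.
machine_p = spread([486, 488, 490, 492, 494])
machine_q = spread([488, 489, 490, 491, 492])
```

For these invented samples the range, MAD and standard deviation for P are each twice the corresponding value for Q, while the MSD and variance, being in squared units, differ by a factor of four.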
Readers who wish may examine the mathematics of the standard deviation in Box 2. Consider again the sample of four bottles, referred to in Box 2. The mathematical shorthand for the Population mean and population standard deviation are widely denoted by the Greek symbols μ and σ, respectively. In statistics, a fairly general convention is to denote population values (parameters) by Greek letters and sample values (statistics) by Latin letters. Data may be stored in the form of columns, constants or matrices. The latter two forms will be introduced later in the book.
The key scenario is one of variables in columns and cases in rows stored in a worksheet. The fundamental method of data entry is via the keyboard directly into the Data window. Data may also be accessed via Minitab worksheet and project files created previously or from files of other types. In order to introduce aspects of both data input and manipulation consider the data displayed in Figure 2.
It gives, for a period of 6 weeks, the number of units per day that fail to pass final inspection in a manufacturing operation that operates 24 hours a day, 7 days a week. The data set used in this example is relatively small, as are those in examples to follow, and some tasks carried out using Minitab could be done more quickly by simply retyping the data in a new worksheet!
However the aim is to use small data sets to illustrate useful facilities in Minitab for the input and manipulation of data. As indicated in Figure 2. Clicking Open enters the data into the Data window. Note the list of file types catered for. Note the descriptive icons positioned beside many of the items on the Data menu in Figure 2. The columns to be stacked are selected as indicated in Figure 2.
Selection may be made by highlighting all six columns simultaneously and clicking the Select button. The option to Store stacked data in:New worksheet was accepted and Name: Daily Failures specified for the new worksheet. In addition, the default to Use variable names in subscript column was accepted. On clicking OK the new worksheet is created.
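What stacking does to the data can be sketched in Python: the columns are placed one on top of another, and a parallel subscripts column records which original column each value came from. The column names and counts below are hypothetical, and the function is only an analogy for the Minitab dialog, not a description of its internals.

```python
def stack_columns(columns):
    """Stack named columns; return (subscripts, stacked values)."""
    subscripts = []
    stacked = []
    # Dictionaries keep insertion order, so columns stack left to right.
    for name, values in columns.items():
        for value in values:
            subscripts.append(name)  # variable name used as the subscript
            stacked.append(value)
    return subscripts, stacked


# Hypothetical failure counts, three days shown for each of two weeks.
weekly = {
    "Week 1": [3, 5, 2],
    "Week 2": [4, 1, 6],
}

subscripts, daily_failures = stack_columns(weekly)
```

After stacking, the values appear in a single column in time order, which is the layout needed for a run chart.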
Column C1 is named Subscripts by the software and column C2, containing the stacked data, is unnamed. Note in Figure 2. Note that the dialog displayed in Figure 2. The data are now arranged in time order and a run chart may be created — this is left as an exercise for the reader. The Excel workbook Failures2. Having opened the Excel workbook as a Minitab worksheet, one may create a column with the data in time order in the same worksheet.
Store stacked data in: C12 indicates that the stacked data are to be stored in column C12. The option Store row subscripts in: was checked Figure 2. Alternative layout for failure data. The new columns require naming. Day is already in use as a column name in the worksheet so the name Day of week was used for the new day column. Column names cannot be duplicated in Minitab. A portion of the stacked data is shown in Figure 2. The allocation of columns for storage of the stacked data and the subscripts in the order C12, C10 and C11 will now appear logical!
Column names could have been entered directly during the dialog. Names, Figure 2. The reader should not be afraid to experiment — if an initial attempt to achieve an objective fails then try again or seek assistance via Help or StatGuide. Again the data are now arranged in time order and a run chart may be created. Suppose that production actually involves two lines, A and B, and that the data stratifies by line as shown in the Excel worksheet displayed in Figure 2.
The data are available in the Excel workbook Failures3. On opening the workbook as a Minitab worksheet the data appear as shown in Figure 2. The first step in overcoming this is to highlight the entire first row of worksheet entries by clicking on the row number 1 at the left-hand side of the worksheet. On doing this the row will appear as in Figure 2.
The next step involves use of the facilities for changing data types available via the Data menu. The six types of changes that may be made are indicated in Figure 2. Here we require a change of data type from text to numeric. The 12 text columns containing the numerical data are specified in Change text columns: and the same column names are specified in Store numeric columns in:. Having changed the data type to numeric, C2-T becomes C2 etc. The completed dialog box is shown in Figure 2. The six blocks of columns to be stacked are specified under Stack two or more blocks of columns on top of each other:.
In the dialog box shown in Figure 2. The default option to Use variables in subscript column has also been accepted. This subscript column will appear to the left of a pair of columns, the first of which will contain the sequence of daily failure counts for line A and the second those for line B. It would be informative to see a display of run charts for both lines on the same diagram. Choose the Multiple option, select the two columns containing the data to be plotted in the Series: Figure 2.
Double-clicking on each of the axis labels enables the labels to be edited appropriately. The plot is displayed in Figure 2. Clearly line B has a higher daily rate of failures than line A. You should verify that the medians are 3 and 5 failures per day for lines A and B, respectively.
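Verifying a pair of medians is simple to do in code as well. The sketch below uses invented counts for two lines (not the book's data), chosen only so that the arithmetic is easy to follow; `line_a` and `line_b` are hypothetical names.

```python
from statistics import median

# Invented daily failure counts for two production lines (illustration only).
line_a = [2, 3, 1, 4, 3, 5, 3]
line_b = [5, 6, 4, 7, 5, 3, 8]

# The median is the middle value of the ordered data, so it is unaffected
# by the occasional extreme day.
medians = {"A": median(line_a), "B": median(line_b)}
print(medians)
```

The median is the reference line used on a run chart, so comparing the two medians is a quick numerical counterpart to comparing the two plotted series.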
The stratification of the Figure 2. Given that the lines have identical production capacity, quality improvement could potentially be achieved through investigation of factors contributing to the poorer performance of line B. In order to illustrate further aspects of data manipulation the reader is invited to open the worksheet Pulse.MTW from the Sample Data folder. The worksheet contains data for a group of 92 students. In an introductory statistics class, the group took part in an experiment.
Each student recorded their height, weight, gender, smoking habit, usual activity level, and pulse rate at rest. Then they all tossed coins; those whose coins came up heads were asked to run on the spot for a minute. Finally, the entire class recorded their pulse rates for a second time. The data for the final ten students are shown in Figure 2. In this data set codes are used for gender in the column labelled Sex, with 1 representing male and 2 representing female.
Before analysing a data set it is always wise to check for any unusual values appearing because of errors in data entry etc. One simple check would be a tally of the values appearing in the Sex column. By default Counts are given but the Session window output in Panel 2. Thus there were 92 students in the class, of whom 57 were male, i. Suppose we wish to replace the numerical codes 1 and 2 with the words Male and Female respectively in the Sex column. One reason for using text rather than numerical values, for example, is that displays of the data can be created having more user-friendly labels.
The dialog is shown in Figure 2. Code data from columns: Sex and Store coded data in columns: Sex means that the coded text values will be stored in the same column of the worksheet as the original numerical codes. Under Original values: note that 1 and 2 have been entered, with the corresponding replacement codes of Male and Female respectively specified under New:.
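The tally and recoding steps can be mimicked with a few lines of Python. In this sketch only the totals (92 students, 57 of them male) come from the text; the order of the codes in the list is arbitrary, and the names `sex`, `labels` and `sex_text` are hypothetical.

```python
from collections import Counter

# 57 codes of 1 (male) and 35 codes of 2 (female): totals from the text,
# ordering invented for illustration.
sex = [1] * 57 + [2] * 35

# Tally: counts and percentages for each code. An unexpected code such as 3
# would show up here and flag a data-entry error.
tally = Counter(sex)
percent = {code: round(100 * n / len(sex), 2) for code, n in tally.items()}

# Recode numeric codes to text labels.
labels = {1: "Male", 2: "Female"}
sex_text = [labels[code] for code in sex]

print(tally, percent)
print(Counter(sex_text))
```

Using a lookup dictionary means that any code without a label raises a `KeyError`, which serves as a second check on the data.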
All eight columns were selected as the source of the data to be unstacked in Unstack the data in:. This may be done by highlighting all eight variables and clicking the Select button. The subscripts to be used for unstacking are in the Sex column and this is indicated via Using subscripts in:. In the dialog box in Figure 2. The default option Name the columns containing the unstacked data was accepted. Thus names would automatically be assigned to the columns containing the unstacked data.
The first eight columns of the new worksheet contain the data for the females while the second eight columns contain the data for the males. Minitab has used the subscripts employed for unstacking the original data to extend the original column names appropriately. Releasing the mouse button and pressing the delete key completes the operation. Finally, double-clicking each of the remaining column names in turn enables them to be edited to their original form as shown in Figure 2.
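Unstacking by a subscript column can be sketched as follows. The rows are invented and use only three of the worksheet's columns; the function name `unstack` and the `column_subscript` naming pattern are assumptions made for illustration, not Minitab's exact naming rule.

```python
# Invented rows in the style of the Pulse worksheet (values are not real data).
rows = [
    {"Pulse1": 64, "Height": 66.0, "Sex": "Male"},
    {"Pulse1": 88, "Height": 62.0, "Sex": "Female"},
    {"Pulse1": 70, "Height": 71.0, "Sex": "Male"},
    {"Pulse1": 76, "Height": 65.0, "Sex": "Female"},
]


def unstack(rows, by):
    """Split every column by the subscript column `by`.

    Each result column is named 'column_subscript', extending the
    original column name with the subscript value.
    """
    columns = {}
    for row in rows:
        subscript = row[by]
        for name, value in row.items():
            columns.setdefault(f"{name}_{subscript}", []).append(value)
    return columns


unstacked = unstack(rows, by="Sex")
for name, values in unstacked.items():
    print(name, values)
```

As in the worksheet, the subscript column itself is unstacked too, producing constant columns (here `Sex_Male` and `Sex_Female`) that carry no further information.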
The adjacent Height and Weight columns may be moved first as follows. Click on C5 at the head of the Height column, keep the mouse button depressed and drag across to C6 so that the Height and Weight columns are highlighted as shown in Figure 2. Further use of the facility for moving columns yields the worksheet in the desired format.
Under Save as type: the default is Minitab. If this option were selected then the worksheet could be saved, for example, as Females. The reader should note the other available file types and observe, when using Windows Explorer, the subtle difference between the icons used for Minitab project and worksheet files. In this section a number of methods of data acquisition in Minitab have been considered. There are situations where the capture of data in real time is of interest. Though Minitab was Figure 2. The data cover the period from October to May A clinical improvement project was commenced in September in order to improve waiting times for the patients.
The data are stored in the supplied Excel spreadsheet Colposcopy. On opening the file as a Minitab worksheet it appears as displayed in Figure 2. The first column is the patient reference number, the second gives the date on which the need for an outpatient appointment for a colposcopy was established and the third gives the date on which the procedure was actually carried out. Note that Minitab has recognized the data in the second and third columns as dates; this is indicated by the column headings C2-D and C3-D.
The aim of the example is to demonstrate how to create a run chart of monthly means of waiting times to indicate process performance before and after process changes. Before creating the run chart, screening of the data for anomalous values will be carried out. This process introduces a number of important facilities in Minitab for the manipulation of data. Calculation of waiting times Calculations may be performed on data in the form of dates. Note how the waiting time for the first patient was 25 days and that the waiting time for the seventh patient is, of course, a missing value.
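Date arithmetic of this kind can be illustrated with Python's `datetime` module. The dates below are invented (the year is arbitrary) and the function name `waiting_days` is hypothetical; the point is that subtracting two dates gives a number of days, and that a missing date must give a missing wait.

```python
from datetime import date

# Invented (referral date, procedure date) pairs; None marks a missing date.
records = [
    (date(2020, 10, 1), date(2020, 10, 26)),  # waited 25 days
    (date(2020, 10, 5), date(2020, 10, 5)),   # seen on the day of referral
    (date(2020, 10, 9), None),                # procedure date not recorded
]


def waiting_days(referral, procedure):
    """Return the wait in days, or None when either date is missing."""
    if referral is None or procedure is None:
        return None
    return (procedure - referral).days


waits = [waiting_days(r, p) for r, p in records]
print(waits)
```

A negative result would indicate a procedure dated before the referral, which is one of the anomalies that screening is intended to catch.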
Note, too, the small green cross at the head of the column of Wait values, indicating that the formula used in the calculation has been assigned to the cells in the column. This means that should any dates in the second or third columns be changed, Wait will be automatically recalculated.
In addition, if dates for further patients were to be added then their Wait values would be calculated automatically. In the dialog box enter Rank data in: Wait and Store ranks in: Rank. Note that the Wait of 25 days for the first patient is ranked When two or more patients have the same value for Wait then the mean of the corresponding ranks is allocated as Rank for these patients. Note that all five columns should be selected in the Sort column(s): window and that sorting By column: Rank is specified. Sorting in ascending order is the default so Descending is left unchecked.
Store sorted data in: Ordered data was used to name the new worksheet. The first few rows of the sorted data are shown in Figure 2. Scrutiny of the ranked data reveals errors. This is impossible, so the data for patient number require checking. There were 19 patients with Wait values of 0, which means that they underwent the procedure on the day that it was deemed necessary, an occurrence likely in situations where clinicians suspected a serious situation for the patient. These 19 patients would be assigned rank values ranging from 2 to 20 inclusive. These values sum to 209, which on division by 19 yields 11. Thus the rank assigned to each of these 19 patients is 11. The 21st and 22nd ordered Wait values were both 1 day, so the corresponding patients are assigned rank 21.5. The final section of the sorted data is shown in Figure 2.
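The rule for tied values, in which each tied observation receives the mean of the ranks it occupies, can be checked with a short sketch. The data are invented apart from the structure described in the text (one value below zero, nineteen zeros, two ones); the negative value of -3 and the function name `average_ranks` are assumptions for illustration.

```python
def average_ranks(values):
    """Rank values in ascending order, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend `end` to cover the whole run of values tied with the current one.
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Positions start..end (0-based) occupy ranks start+1..end+1.
        mean_rank = ((start + 1) + (end + 1)) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


# One invented negative wait, 19 waits of 0 days, two waits of 1 day, one of 25.
waits = [-3] + [0] * 19 + [1, 1] + [25]
ranks = average_ranks(waits)

print(sum(range(2, 21)), sum(range(2, 21)) / 19)  # ranks 2..20 and their mean
print(ranks[0], ranks[1], ranks[20], ranks[22])
```

Sorting on the rank column then brings suspicious values, such as a negative wait, to the top of the worksheet where they are easy to spot.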
Further relevant information emerges. Five patients have no dates for the procedure recorded. The patient with reference number had a wait of days, which exceeds the length of the study period. Suppose that discussion with the project leader reveals that the colposcopy dates for patients and Figure 2.
Suppose, too, that she indicates that the patients with reference numbers 7, , and should be removed from the data set, as they had moved away from the area served by the hospital. There may still, of course, be further errors.
The worksheet of ordered data may be deleted and the appropriate corrections and deletions made in the original worksheet. Correcting the data The date corrections can be made first. An example of the dialog involved in locating a cell for correction is shown in Figure 2. Double-clicking the relevant cells enables the edits to be made.
The change for the patient with number may be made similarly. Note that in this case the patient number matches the row number and that the rows have to be deleted from all four of the columns available for selection. The Wait column is not available for selection as it was assigned a formula. Note that there are values of Wait, with none missing, and that the mean was The dialog is displayed in Figure 2.
With the selection of the four-digit Year component and the Month component in the Minitab dialog box, any referral date in October will be coded as , any referral date in November as etc. The means can then be copied from the Session window, along with the code for the months, and pasted into a new worksheet as follows. With the mouse pointer, click on the first cell in column C1 of the new worksheet before pasting, accepting the default setting to Use spaces as delimiters. The run chart displayed in Figure 2. It would appear that there has been Figure 2. Note that the P-values provide evidence of the presence of special cause variation.
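The coding of referral dates by year and month, followed by the calculation of monthly means, can be sketched as below. The records are invented, the year is arbitrary, and the combined code is assumed here to take the form YYYYMM; `month_code` and `monthly_means` are hypothetical names.

```python
from datetime import date
from statistics import mean

# Invented (referral date, wait in days) records.
records = [
    (date(2020, 10, 3), 25),
    (date(2020, 10, 20), 31),
    (date(2020, 11, 2), 18),
    (date(2020, 11, 15), 22),
    (date(2020, 12, 1), 12),
]


def month_code(d):
    """Combine four-digit year and month into one sortable number (assumed YYYYMM)."""
    return d.year * 100 + d.month


# Group the waits by month code, then average within each month.
by_month = {}
for referral, wait in records:
    by_month.setdefault(month_code(referral), []).append(wait)

monthly_means = {code: mean(waits) for code, waits in sorted(by_month.items())}
print(monthly_means)
```

A numeric code of this form sorts chronologically across a change of year, which is why it is convenient for ordering the monthly means before plotting them on a run chart.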
The author has heard a run chart described as a naked control chart. In Chapter 5 the construction and use of control charts will be introduced. Chapter 4 will be devoted to the introduction of the basic concepts of probability and of statistical models that provides essential underpinning for control charts. For a familiar process obtain a sequence of measurements and display them in the form of a run chart.
If you do not have access to data from a workplace situation then journey durations, your golf scores etc. Do you consider the process to be behaving in a stable predictable manner? Use ReportPad to note any conclusions you make regarding process performance. Save your work as a Minitab project to facilitate updating the data and to provide a control charting exercise at a later date. You might wish to name the project file Ch2Ex1.
Display the data in the form of a run chart and comment. In the bottle weight example used in this chapter the sample subgroup size was 1. The worksheet Bottles.MTW contains data giving the weights of samples of four bottles taken every 15 minutes from a bottle-forming process. Open the worksheet and create a run chart. Here the data are arranged as subgroups across rows of columns C2, C3, C4 and C5, so this option has to be selected in the Run Chart dialog. The default is to plot the means of the subgroups of four weights with reference line placed at the median of the 25 sample means, i.
The means are plotted as red squares and connected by line segments, and in addition the individual weights are plotted as black dots. The alternative option is to plot the subgroup medians, in which case the reference line is placed at the median of the 25 medians, i. Note from the run chart legends that there is no evidence of any special cause variation. Stack the weights into a single column named weight and verify that the mean and standard deviation of the total sample of bottles are Create a histogram of the data with reference lines placed at the specification limits of Comment on the shape of the distribution and on process performance in relation to the specifications.
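The quantities plotted on a run chart of subgroups can be computed directly. The sketch below uses three invented subgroups of four weights rather than the 25 subgroups in the worksheet, and the variable names are hypothetical; it shows the two choices described in this exercise, plotting subgroup means or subgroup medians, each with its own reference line.

```python
from statistics import mean, median

# Invented subgroups of four bottle weights, one subgroup per row.
subgroups = [
    [488, 490, 487, 491],
    [489, 492, 490, 489],
    [486, 488, 491, 487],
]

# Default choice: plot subgroup means, reference line at the median of the means.
subgroup_means = [mean(g) for g in subgroups]
reference_for_means = median(subgroup_means)

# Alternative: plot subgroup medians, reference line at the median of the medians.
subgroup_medians = [median(g) for g in subgroups]
reference_for_medians = median(subgroup_medians)

print(subgroup_means, reference_for_means)
print(subgroup_medians, reference_for_medians)
```

The two reference lines need not coincide, since the median of the subgroup means and the median of the subgroup medians are different summaries of the same data.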
Save your work as a Minitab project. During the measure phase of a Six Sigma project a building society collected data on the time taken measured as working days rounded to the nearest day to process mortgage loan applications. The data are stored in the supplied worksheet Loans1. Display and summarize the data and comment on the shape of the distribution.
Use a stem-and-leaf display to determine the number of times which exceeded the industry standard of 14 working days. The data are stored in Loans2. Display and summarize the data in order to assess the effectiveness of the project. The file Statin. Open the file in Minitab and create a column giving the monthly proportions of stroke patients on a statin at time of admission. Create a run chart of the proportions and comment. The Minitab worksheet Shareprice. For the Start Values, accept the default One set for all variables and enter 1 for Month and for Year.
Observe how moving the mouse pointer to a point on a plot leads to display of the variable, its value and the corresponding month and year. The histogram was introduced as a type of bar chart in which there are no gaps between the bars, indicating that the variable being displayed is continuous.
One can use the histogram facility in Minitab to display discrete random variables and to emphasize the discrete nature of the data by having gaps between the bars. The supplied worksheet AcuteMI.MTW contains daily counts of the number of patients admitted to the accident and emergency department of a major city hospital with a diagnosis of acute myocardial infarction. Create a histogram of the data, making use of Data View. Double-click on one of the project lines, select Custom and increase Size to, say, 5.

Overview

Two tools of exploratory data analysis (EDA), the stem-and-leaf plot and boxplot, provide versatile and informative displays of process data and enable outliers to be defined and detected.
The brushing facility in Minitab is introduced as it enables subsets of data sets displayed in graphs to be readily identified and explored. There are many situations where two or more performance measurements are made in assessing process performance, so some familiarity with techniques for the display and summary of bivariate and multivariate data is important.
The route from Loanhead, across the Forth Road Bridge, was 25 miles long. A run chart of journey duration minutes for 32 journeys undertaken during September and October is shown in Figure 3. The low P-value for clustering indicates the possible influence on journey duration of some special causes. In addition to journey duration, the weather was recorded as either dry D or wet W.
The author had a hunch that this uncontrollable! The data are given in Table 3. In order to compare journey durations under the two categories of weather conditions, dry and wet, and in order to make further exploration of the data easier, stem-and-leaf displays will be constructed.
Scrutiny of the data reveals durations ranging from the thirties to the sixties. Stem values of 3, 4, 5 and 6 will be used corresponding to the thirties, forties, fifties and sixties. Each stem will appear twice in the display corresponding to the low thirties, the high thirties, the low forties, the high forties etc. Consider first the durations for dry days: 37, 40,. Thus the first dry day duration of 37 minutes is a value in the high thirties, i.
The second dry day duration of 40 minutes is a value in the low forties, i. In either case By variable: Wcode has to be specified as only numeric variables may be used in this way with the Stem-and-Leaf procedure. Table 3. The stems are shown in the second column and the leaves in the third column in Panel 3.
The first column contains what are known in the jargon of exploratory data analysis (EDA) as the values of depth. The first four depths listed, 1, 2, 6 and 10, correspond to the highest durations on each of the first four stems. Thus, for example, the highest duration of 47 on the third stem is 6 values deep into the ordered data set starting from the minimum. The final two depths listed of 10 and 6 correspond to the lowest durations on each of the final two stems.
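The construction of a split-stem display with depths can be sketched in Python. The durations below are invented (only the values 37 and 40 appear in the text), and `stem_and_leaf` is a hypothetical helper. The depth of a row is taken here as the smaller of the count from the minimum and the count from the maximum; any special treatment Minitab gives to the row containing the median is not reproduced.

```python
def stem_and_leaf(values):
    """Return (depth, stem, leaves) rows with each stem split in two.

    Leaves 0-4 go on the first line of a stem and leaves 5-9 on the second.
    """
    rows = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        rows.setdefault((stem, leaf >= 5), []).append(leaf)

    n = len(values)
    display, seen = [], 0
    for (stem, _), leaves in sorted(rows.items()):
        from_low = seen + len(leaves)  # values counted from the minimum through this row
        from_high = n - seen           # values counted from the maximum through this row
        display.append((min(from_low, from_high), stem, "".join(map(str, leaves))))
        seen += len(leaves)
    return display


# Invented journey durations in minutes.
durations = [37, 40, 41, 43, 44, 46, 47, 52, 55, 61]
display = stem_and_leaf(durations)

for depth, stem, leaves in display:
    print(f"{depth:>3}  {stem}  {leaves}")
```

In this sketch rows in the lower half of the data carry depths counted from the minimum and rows in the upper half carry depths counted from the maximum, so the depths rise towards the middle of the display and then fall again.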