Lab. 1. Orientation in software: Microsoft Excel

You need a diskette to do this lab.
Don't forget to save your work regularly because the system may crash any time, and you may loose everything you have done!

1. Open the Microsoft Excel window. Go through the Tutorial: select Help/Examples and Demos. Learn: Entering data, Creating formulas and links, Editing a worksheet, Formatting a worksheet, Creating a chart, Formatting a chart. If your computer has not enough memory to run tutorials, then use Help/Contents/Using Microsoft Excel; then go through Essential skills, Creating charts, Solving problems... [Using Solver, Statistical analysis of data].

2. Enter data into column A: 1.4 4.6 10.7. Edit data: change 1.4 to 11.4; change 4.6 to 5.6 Save data on your diskette: (a) insert a diskette (format it if it was not formatted before), (b) go to File/Save, (c) select the diskette (in the desktop), (d) enter the file name, (e) save.

3. Select a row (use the row button), a column (use a column button), a rectangle. Delete all data (select it and then delete). Delete the worksheet. Retrieve data from the diskette (File/Open...). Select 3 cells, copy them to the end of the column. Copy the first column to the second column, and to the third column. Select a rectangle (matrix) A1:B5, cut it and paste it into E1.

4. Delete all data. Put an arithmetic series 0, 1,..., 20 into column A: enter first 2 numbers, select them and then click on the small square at right-bottom of the selected block and drag it down to the cell A21. This operation is called autofill. Enter the equation =1/(1+exp(10-A1)), which is a logistic equation, into B1. When entering the equation, instead of typing A1 click with a mouse in cell A1 (this makes writing equations faster). Click on the small square at right-bottom of cell B1 and drag it down to the cell B21 (autofill). Click on any other cell in column B to see the equation.

Note: How cell references are automatically updated in these equations? What is the rule of equation change?

5. Modify the equation in cell B2: =1/(1+exp(10-$A$1)). Do not re-write the equation, but edit the existing one! Expand this equation to the entire column in the same way as before. Select any cell in column B and check if the equation is different from that in cell B1.

Notes:

  • Dollar-sign in the cell reference prevents the change of this reference.
  • Dollar-sign can be used before the column letter, or before the row number, or before both.
Now restore the original equation in the entire column B: make Edit/Undo. This will undo the autofill operation (note that you cannot make Undo for 2 operations!). To restore the equation in the first cell, select the second cell in column B and autofill it up to the cell B1. Check the equation in cell B1 if it has been restored.

6. Tables. Insert a row at the top of the spreadsheet and put column names there: "Time" for column A; and "Population density" for column B. Change font in all cells to Times New Roman, 12 pt. Select 2 cells with column names. Put a line at the top and a double line at the bottom of this row (Format/Cells/Border). Also but the line at the bottom of the table. Select the first column. Change the format of numbers so that all numbers will have 1 decimal digit after the dot. Select the second column. Change the format of numbers so that 3 digits will be shown after the dot.

7. The title "Population density" does not fit into the cell. Select both column titles and format its alignment (Format/Cells/Alignment). Center it horizontally and vertically, use word wrap. Then change the width of the first row: click between row buttons at the left and drag. If necessary, change the length of columns in the same way. Now we can copy this table into the Microsoft Word document. Open Microsoft Word, select the table in Excel and copy it to Word. You can edit the table in word: change font, line spacing, borders, alignment, etc.

8. Graphics. Select columns A & B and click on graphics (magic stick). Select an area for the graph by dragging the mouse in the worksheet. Select "scatter" graph with lines. Go through all menus, name the graph: "Logistic curve", use first column as the X-axis.

9. Modify the graph: double-click on it. Then double-click on the line, on axes, etc. Change line width, style and color, change markers, remove legend, etc. Go through all menus of editing. Resize elements within the graph.

10. The next step is to introduce parameters into equation. First, write the title "Parameters" in cell C1. Then enter parameter names: "maximum", "slope", and "center" in cells C2:C4 without quotation marks. If the graph covers these cells, then move it down with the mouse. Name the cells D2:D4 as "maximum", "slope", and "center", respectively. To name a cell select this cell, then go to Insert/Name/Define. The name will be suggested by the Excel; if you agree, click OK; if not, you can modify the name.

Notes:

  • You can name not only cells but also rows or columns!
  • Don't use single or double characters as names (e.g., "a", "bb", "b1") because these strings are reserved for column references or cell references!
  • A shortcut for Insert/Name/Define is Ctrl+F3 (on PC) or Apple+F3 (on the Mac). Try it, it works much faster!
  • If you select a cell with a name, then this name will appear in the field to the left from the equation editing field.
Put numbers 1, 1, and 10 into cells D1:D3 (which are named). Click on cell B1 and modify the equation as: =maximum/(1+exp(slope*(center-A1))). Autofill this cell to the entire column B to modify all equations. The graph should remain the same. Now modify parameters in cells D1:D3 and view the results. Change parameters in the following ranges: maximum=1:10; slope=0.2:5, centert=0:20. How each parameter affects the graph?

11. The next step is to fit our logistic model to some real data. First we need to free some space for entering our data in the spreadsheet. Select cells C1:D4 (block of cells with parameters) and move them to the right by 2 columns. To move a selected block, click on its thick border when the cursor changes its shape to an arrow, and drag the block to a new position.

We will use the following data of the increase in gypsy moth counts in pheromone traps in a series of years in the area that was initially uninfested:
0 1 2 1 0 5 10 8 30 70 150 300 400 800 900 1100 900 1000 1050 1200 900
You need to enter these numbers into the column C. This can be done by entering these numbers by hand, but it is nice to learn some computer tricks how to enter these data automatically. Copy the numbers above and paste them into a new MS Word document. Then open the Edit/Replace dialog box, put the space-character to be found and be replaced by a paragraph mark. To enter the paragraph mark click on the button "Special". Then select "Replace all". Now all numbers will appear in a column. Select all (Ctrl+A or Apple+A) and copy. Then go to the MS Excel, select cell C2, and open Edit/Paste-Special dialog box. Paste as text. Now all should be done. Make a header for column C: "Data" and format it if necessary. In cell B1, change the title to "Model".

12. Now we need to update our graph so that it will show both the data points and the model curve. Select the area A1:C22 and plot the scatter graph. Edit it so that data points are shown by markers, and the model is shown by line. To make the model closer to data, modify model parameters as follows: maximum = 800, slope = 0.5, center = 8. The result should look as follows:

13. The next step is to adjust model parameters so that the model will be closer to data points. We will use the least square method which is minimizing the squared difference between the data and the model. Estimate squared differences in column D. Enter equation =(B2-C2)^2 in the cell D2 and autofill it to the end of the column. Sum all numbers in column D: select cell D23 and click on the "sigma" button. Excel automatically selects the block for summation. If this block is not correct, you can change it using the mouse. Then click OK (green check button near the equation field). Now we need to modify model parameters so that the sum of squares (cell D23) is minimized. When the cell D23 is selected, click on Tools/Solver in the menu. If Solver is not shown in the menu, you need to go to Add-Ins and to install the Solver tool. If Solver is present but is grayed, it means that you are in the process of editing a cell, or in some other activity incompatible with running the Solver tool. In this case, finish editing, click on the cell D23., and then try using the Solver again.

When the Solver dialog box appears, check that cell D23 is the target cell. Click on the "min" radio button so that the target cell will be minimized. Then click on the field "By changing cells:" and select 3 cells that contain parameter values with the mouse. Then click "Solve" and watch how the sum of squares is reduced in the left bottom corner of the Excel window. After several iterations, the Solver will stop. Accept the solution (press OK) and see how the graph has changed. Does the model fit better to data points now?

14. In the menu select File/Print Preview. Use setup to change page orientation. Change margins and preview again. Go to Setup and delete grid lines. Print the graph.

15. Excel is mostly used for making budgets. Delete all previous data, and develop a budget for your future project. Imagine that you are a Principal Investigator and your position is Assistant Professor, 9 months appointment with a base salary of $35,000. Use form CSREES-55 as a pattern to create your budget in Excel spreadsheet. You need to put salary for yourself (3 months), include fringe benefits (24% of your salary), pay salary for a GRA ($11,200 per year), pay tuition for the GRA ($4,200, put it into "Other direct costs"), add any other costs you want. Specify indirect costs as 14% of total direct costs. Use equations to estimate percentages and to sum numbers. When the budget is ready, try to modify some entries. The table will be updated automatically. Print your budget.

16 The next exercise is to plot a 3-d graph (a 2-dimensional normal distribution). You will learn how to use dollar-sign in cell references to prevent the automatic update of either row number or column letter. Delete everything in the spreadsheet. Put numbers 0:20 in the column A2:A22; put numbers 0:20 in the row B1-V1 (cell A1 is empty). Put equation =exp(-(($A2-10)^2+(B$1-10)^2)/20) into cell B2. Note dollar-signs ($) which make a column or a row constant (not-adjustable). Autofill this equation in the column B2:B22. Then, autofill the entire column B2:B22 to the right up to column V. Look at the numbers. Do they look like a 2-d normal distribution? If not, check the equation. Look at equations at different cells and check if the dollar sign works. Plot the matrix B2:V22 as a 3-d surface. Double-click on the graph so that it will have a thick border. Then go to print preview, and finally print the graph. The spreadsheet will not be printed.

17. In the following exercises you will learn how to use statistical tools in Excel.
Delete everything in the spreadsheet and then fill columns A and B with the following numbers:
Column A: 16, 14, 14, 13, 20, 21, 23, 23, 26, 11, 14, 27, 10, 15, 23, 12, 16, 18, 28, 17, 18, 25, 24, 24, 30, 19, 14, 16, 26, 18
Column B: 37, 38, 38, 37, 40, 41, 40, 40, 43, 36, 36, 42, 37, 39, 39, 36, 39, 38, 43, 37, 40, 40, 40, 42, 42, 40, 36, 39, 43, 39

Note: use MS Word formatting to enter these numbers using copy and paste commands (see above).

Select: Tools/Data Analysis/Descriptive Statistics. Select input range A1:B30, check confidence intervals box, check the descriptive statistics box (at the bottom), select output range C1:E10, and press OK. What is Standard Error, Median, Mode, Standard Deviation, Variance, Kurtosis, Skewness, Confidence interval?

18. Delete columns C-F: selecting these columns and then go to the menu Edit/Delete. This is a good way to delete entire columns or rows. To make a histogram, you need to create a column with value thresholds (called bins in Excel). Excel can create bins automatically, but you have better control of the results if you make bins yourself. Put values 10 and 15 in cells C1 and C2, select both cells and autofill the column down till the cell C7. Go to Tools/Data Analysis/Histogram; input range A1:A30; bin range C1:C7; specify the output range E1:F8.

Bins are threshold values that separate value classes. For graphs is better to specify middle values for each class. Thus, edit the first column in the output by putting the number 8 instead of 10, 13 instead of 15. Then autofill this arithmetic series. Select the output block of cells (don't include the last line), and plot the histogram.

19. Estimate correlation matrix (Tools/Data Analysis/Correlation). Then make a single-factor ANOVA. What is SS (sum of squares) within groups, between groups and total? Why there is 1 degree of freedom (df) between groups and 58 df within groups? What is mean square error (MS)? What is F and Fcrit? What is P-value? Is the difference in means statistically significant? If the difference was non-significant, would it mean that mean values are equal? Make a 2-factor ANOVA without replication. What does it mean? When we apply this kind of analysis?

19. Make regression analysis. Check boxes: confidence level, residuals, standardized residuals, residual plots, line fit plots, normal probability plots. What is R-square, MS, F, P? Explore all output tables and graphs. Explain why we may need all this information. What are standard residuals? How to find outlier points in regression analysis.

Notes:

  • You can do regression with multiple variables. For example, you may have 1 column of Y-values and 3 columns of X-values. But it is important that all X-value columns are adjacent to each other.
  • Excel does not do advanced statistics, e.g. >2-factor ANOVA, ANCOVA, GLM. To do these analyses use either SAS or MINITAB for Windows.


Alexei Sharov 12/4/98