The Lonely Journey: December 2009

Tuesday, December 22, 2009

Independent Samples T-Test and Levene's Test

What it does:

The Independent Samples T Test compares the mean scores of two groups on a given variable.

Where to find it:

Under the Analyze menu, choose Compare Means, then Independent Samples T Test. Move your dependent variable into the box marked "Test Variable." Move your independent variable into the box marked "Grouping Variable." Click on the box marked "Define Groups" and specify the value labels of the two groups you wish to compare.

Assumptions:

-The dependent variable is normally distributed. You can check for normal distribution with a Q-Q plot.
-The two groups have approximately equal variance on the dependent variable. You can check this by looking at the Levene's Test. See below.
-The two groups are independent of one another.

Hypotheses:

Null: The two groups have the same population mean on the dependent variable.
Alternate: The two groups have different population means on the dependent variable.

SPSS Output

The following is sample output from an independent samples T test. We compared the mean blood pressure of patients who received a new drug treatment vs. those who received a placebo (a sugar pill).

First, we see the descriptive statistics for the two groups. We see that the mean for the "New Drug" group is higher than that of the "Placebo" group. That is, people who received the new drug have, on average, higher blood pressure than those who took the placebo.

Next, we see the Levene's Test for Equality of Variances. This tells us if we have met our second assumption (the two groups have approximately equal variance on the dependent variable). If the Levene's Test is significant (the value under "Sig." is less than .05), the two variances are significantly different. If it is not significant (Sig. is greater than .05), the two variances are not significantly different; that is, the two variances are approximately equal. If the Levene's test is not significant, we have met our second assumption. Here, we see that the significance is .448, which is greater than .05. We can assume that the variances are approximately equal.
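As a rough illustration of what Levene's Test computes, the sketch below works out the test statistic by hand: each score is replaced by its absolute distance from its own group mean, and a one-way ANOVA F ratio is formed on those distances. The function name `levene_w` and the blood pressure readings are hypothetical, and the mean-centred form used here is an assumption about the variant being reported. Turning the statistic into the "Sig." value needs an F distribution with (k - 1, N - k) degrees of freedom, which is left to a statistics package.

```python
from statistics import mean


def levene_w(*groups):
    """Levene's W statistic (mean-centred form) for two or more groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of each score from its own group mean.
    z = [[abs(y - mean(g)) for y in g] for g in groups]
    z_means = [mean(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total
    # Between-group and within-group sums of squares of the deviations.
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within


# Hypothetical blood pressure readings, six patients per group.
new_drug = [140, 152, 148, 160, 155, 150]
placebo = [130, 135, 128, 140, 138, 132]
print("Levene W =", round(levene_w(new_drug, placebo), 3))
```

A small W means the two groups spread out by similar amounts around their own means, which is the situation in which the test comes out non-significant.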

Finally, we see the results of the Independent Samples T Test. Read the TOP line if the variances are approximately equal. Read the BOTTOM line if the variances are not equal. Based on the results of our Levene's test, we know that we have approximately equal variance, so we will read the top line.

Our T value is 3.796.

We have 10 degrees of freedom.

There is a significant difference between the two groups (the significance is less than .05).

Therefore, we can say that there is a significant difference between the New Drug and Placebo groups. People who took the new drug had significantly higher blood pressure than those who took the placebo.
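To show where the top and bottom lines of the output come from, here is a minimal sketch of the two calculations: the pooled-variance t statistic (equal variances assumed) and the Welch version (equal variances not assumed). The function names and the readings are hypothetical and are not the data behind the output described above. Only the t value and degrees of freedom are computed; the significance would come from a t distribution table or a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """t statistic and df assuming equal variances (the top line)."""
    n1, n2 = len(a), len(b)
    # Pooled variance: the two sample variances weighted by their df.
    sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    t = (mean(a) - mean(b)) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


def welch_t(a, b):
    """t statistic and df without assuming equal variances (the bottom line)."""
    n1, n2 = len(a), len(b)
    v1, v2 = variance(a) / n1, variance(b) / n2
    t = (mean(a) - mean(b)) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical blood pressure readings, six patients per group.
new_drug = [140, 152, 148, 160, 155, 150]
placebo = [130, 135, 128, 140, 138, 132]
print("equal variances assumed:    ", pooled_t(new_drug, placebo))
print("equal variances not assumed:", welch_t(new_drug, placebo))
```

With two groups of six, the pooled version has 6 + 6 - 2 = 10 degrees of freedom, the same count as in the example above. The Welch degrees of freedom are never larger than the pooled ones.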

Source: http://www.wellesley.edu/Psychology/Psych205/indepttest.html

Friday, December 18, 2009

A suggested thesis structure

The list of contents and chapter headings below is appropriate for some theses. In some cases, one or two of them may be irrelevant. Results and Discussion are usually combined in several chapters of a thesis. Think about the plan of chapters and decide what is best to report your work. Then make a list, in point form, of what will go in each chapter. Try to make this rather detailed, so that you end up with a list of points that corresponds to subsections or even to the paragraphs of your thesis. At this stage, think hard about the logic of the presentation: within chapters, it is often possible to present the ideas in different order, and not all arrangements will be equally easy to follow. If you make a plan of each chapter and section before you sit down to write, the result will probably be clearer and easier to read. It will also be easier to write.

Copyright waiver

Your institution may have a form for this (UNSW does). In any case, this standard page gives the university library the right to publish the work, possibly by microfilm or other medium. (At UNSW, the Postgraduate Student Office will give you a thesis pack with various guidelines and rules about thesis format. Make sure that you consult that for its formal requirements, as well as this rather informal guide.)

Declaration

Check the wording required by your institution, and whether there is a standard form. Many universities require something like: "I hereby declare that this submission is my own work and that, to the best of my knowledge and belief, it contains no material previously published or written by another person nor material which to a substantial extent has been accepted for the award of any other degree or diploma of the university or other institute of higher learning, except where due acknowledgment has been made in the text. (signature/name/date)"

Title page

This may vary among institutions, but as an example: Title/author/"A thesis submitted for the degree of Doctor of Philosophy in the Faculty of Science/The University of New South Wales"/date.

Abstract

Of all your thesis, this part will be the most widely published and most read because it will be published in Dissertation Abstracts International. It is best written towards the end, but not at the very last minute because you will probably need several drafts. It should be a distillation of the thesis: a concise description of the problem(s) addressed, your method of solving it/them, your results and conclusions. An abstract must be self-contained. Usually they do not contain references. When a reference is necessary, its details should be included in the text of the abstract. Check the word limit. Remember: even though it appears at the beginning, an abstract is not an introduction. It is a résumé of your thesis.

Acknowledgments

Most thesis authors put in a page of thanks to those who have helped them in matters scientific, and also indirectly by providing such essentials as food, education, genes, money, help, advice, friendship etc. If any of your work is collaborative, you should make it quite clear who did which sections.

Table of contents

The introduction starts on page 1; the earlier pages should have roman numerals. It helps to have the subheadings of each chapter, as well as the chapter titles. Remember that the thesis may be used as a reference in the lab, so it helps to be able to find things easily.

Introduction

What is the topic and why is it important? State the problem(s) as simply as you can. Remember that you have been working on this project for a few years, so you will be very close to it. Try to step back mentally and take a broader view of the problem. How does it fit into the broader world of your discipline?

Especially in the introduction, do not overestimate the reader's familiarity with your topic. You are writing for researchers in the general area, but not all of them need be specialists in your particular topic. It may help to imagine such a person---think of some researcher whom you might have met at a conference for your subject, but who was working in a different area. S/he is intelligent, has the same general background, but knows little of the literature or tricks that apply to your particular topic.

The introduction should be interesting. If you bore the reader here, then you are unlikely to revive his/her interest in the materials and methods section. For the first paragraph or two, tradition permits prose that is less dry than the scientific norm. If you want to wax lyrical about your topic, here is the place to do it. Try to make the reader want to read the heavy bundle that has arrived uninvited on his/her desk. Go to the library and read several thesis introductions. Did any make you want to read on? Which ones were boring?

This section might go through several drafts to make it read well and logically, while keeping it short. For this section, I think that it is a good idea to ask someone who is not a specialist to read it and to comment. Is it an adequate introduction? Is it easy to follow? There is an argument for writing this section---or at least making a major revision of it---towards the end of the thesis writing. Your introduction should tell where the thesis is going, and this may become clearer during the writing.

Literature review

Where did the problem come from? What is already known about this problem? What other methods have been tried to solve it?

Ideally, you will already have much of the hard work done, if you have been keeping up with the literature as you vowed to do three years ago, and if you have made notes about important papers over the years. If you have summarised those papers, then you have some good starting points for the review.

If you didn't keep your literature notes up to date, you can still do something useful: pass on the following advice to any beginning PhD students in your lab and tell them how useful this would have been to you. When you start reading about a topic, you should open a spreadsheet file, or at least a word processor file, for your literature review. Of course you write down the title, authors, year, volume and pages. But you also write a summary (anything from a couple of sentences to a couple of pages, depending on the relevance). In other columns of the spreadsheet, you can add key words (your own and theirs) and comments about its importance, relevance to you and its quality.

How many papers? How relevant do they have to be before you include them? Well, that is a matter of judgement. On the order of a hundred is reasonable, but it will depend on the field. You are the world expert on the (narrow) topic of your thesis: you must demonstrate this.

A political point: make sure that you do not omit relevant papers by researchers who are likely to be your examiners, or by potential employers to whom you might be sending the thesis in the next year or two.

Middle chapters

In some theses, the middle chapters are the journal articles of which the student was major author. There are several disadvantages to this format.

One is that a thesis is both allowed and expected to have more detail than a journal article. For journal articles, one usually has to reduce the number of figures. In many cases, all of the interesting and relevant data can go in the thesis, and not just those which appeared in the journal. The degree of experimental detail is usually greater in a thesis. Relatively often a researcher requests a thesis in order to obtain more detail about how a study was performed.

Another disadvantage is that your journal articles may have some common material in the introduction and the "Materials and Methods" sections.

The exact structure in the middle chapters will vary among theses. In some theses, it is necessary to establish some theory, to describe the experimental techniques, then to report what was done on several different problems or different stages of the problem, and then finally to present a model or a new theory based on the new work. For such a thesis, the chapter headings might be: Theory, Materials and Methods, {first problem}, {second problem}, {third problem}, {proposed theory/model} and then the conclusion chapter. For other theses, it might be appropriate to discuss different techniques in different chapters, rather than to have a single Materials and Methods chapter.

Here follow some comments on the elements Materials and Methods, Theory, Results and discussion which may or may not correspond to thesis chapters.

Materials and Methods

This varies enormously from thesis to thesis, and may be absent in theoretical theses. It should be possible for a competent researcher to reproduce exactly what you have done by following your description. There is a good chance that this test will be applied: sometime after you have left, another researcher will want to do a similar experiment either with your gear, or on a new set-up in a foreign country. Please write for the benefit of that researcher.

In some theses, particularly multi-disciplinary or developmental ones, there may be more than one such chapter. In this case, the different disciplines should be indicated in the chapter titles.

Theory

When you are reporting theoretical work that is not original, you will usually need to include sufficient material to allow the reader to understand the arguments used and their physical bases. Sometimes you will be able to present the theory ab initio, but you should not reproduce two pages of algebra that the reader could find in a standard text. Do not include theory that you are not going to relate to the work you have done.

When writing this section, concentrate at least as much on the physical arguments as on the equations. What do the equations mean? What are the important cases?

When you are reporting your own theoretical work, you must include rather more detail, but you should consider moving lengthy derivations to appendices. Think too about the order and style of presentation: the order in which you did the work may not be the clearest presentation.

Suspense is not necessary in reporting science: you should tell the reader where you are going before you start.

Results and discussion

The results and discussion are very often combined in theses. This is sensible because of the length of a thesis: you may have several chapters of results and, if you wait till they are all presented before you begin discussion, the reader may have difficulty remembering what you are talking about. The division of Results and Discussion material into chapters is usually best done according to subject matter.

Make sure that you have described the conditions which obtained for each set of results. What was held constant? What were the other relevant parameters? Make sure too that you have used appropriate statistical analyses and tests. Where applicable, show measurement errors and standard errors on the graphs.

Take care plotting graphs. The origin and intercepts are often important so, unless the ranges of your data make it impractical, the zeros of one or both scales should usually appear on the graph. You should show error bars on the data, unless the errors are very small. For single measurements, the bars should be your best estimate of the experimental errors in each coordinate. For multiple measurements these should include the standard error in the data. The errors in different data are often different, so, where this is the case, regressions and fits should be weighted (i.e. they should minimize the sum of squares of the differences weighted inversely as the size of the errors). A common failing in many simple software packages that draw graphs and do regressions is that they do not treat errors adequately. UNSW student Mike Johnston has written a plotting routine that plots data with error bars and performs weighted least squares regressions. It is at http://www.phys.unsw.edu.au/3rdyearlab/graphing/graph.html. You can just 'paste' your data into the input and it generates a .ps file of the graph.
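For readers who want to see what a weighted fit involves, here is a minimal sketch of a weighted straight-line regression. It is not the routine mentioned above; the function name `weighted_line_fit` and the data are hypothetical, and it assumes the common convention of weighting each squared residual by 1/sigma^2, where sigma is the error bar on that y value (errors in x are ignored).

```python
def weighted_line_fit(x, y, sigma):
    """Fit y = intercept + slope * x, weighting each point by 1 / sigma**2.

    Returns (intercept, slope, intercept_error, slope_error).
    """
    w = [1.0 / s ** 2 for s in sigma]
    s_w = sum(w)
    s_x = sum(wi * xi for wi, xi in zip(w, x))
    s_y = sum(wi * yi for wi, yi in zip(w, y))
    s_xx = sum(wi * xi * xi for wi, xi in zip(w, x))
    s_xy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    # Determinant of the weighted normal equations.
    delta = s_w * s_xx - s_x ** 2
    slope = (s_w * s_xy - s_x * s_y) / delta
    intercept = (s_xx * s_y - s_x * s_xy) / delta
    return intercept, slope, (s_xx / delta) ** 0.5, (s_w / delta) ** 0.5


# Hypothetical measurements; the last point has a much larger error bar,
# so it pulls the fitted line far less than the others do.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.1, 2.9, 5.2, 9.0]
sigma = [0.1, 0.1, 0.2, 2.0]
print(weighted_line_fit(x, y, sigma))
```

An unweighted fit would treat the uncertain last point as being as trustworthy as the first three, which is the failing the paragraph above warns about.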

In most cases, your results need discussion. What do they mean? How do they fit into the existing body of knowledge? Are they consistent with current theories? Do they give new insights? Do they suggest new theories or mechanisms?

Try to distance yourself from your usual perspective and look at your work. Do not just ask yourself what it means in terms of the orthodoxy of your own research group, but also how other people in the field might see it. Does it have any implications that do not relate to the questions that you set out to answer?

Final chapter, references and appendices

Conclusions and suggestions for further work

Your abstract should include your conclusions in very brief form, because it must also include some other material. A summary of conclusions is usually longer than the final section of the abstract, and you have the space to be more explicit and more careful with qualifications. You might find it helpful to put your conclusions in point form.

It is often the case with scientific investigations that more questions than answers are produced. Does your work suggest any interesting further avenues? Are there ways in which your work could be improved by future workers? What are the practical implications of your work?

This chapter should usually be reasonably short---a few pages perhaps. As with the introduction, I think that it is a good idea to ask someone who is not a specialist to read this section and to comment.

References (See also under literature review)

It is tempting to omit the titles of the articles cited, and the university allows this, but think of all the times when you have seen a reference in a paper and gone to look it up only to find that it was not helpful after all.

Should you reference web sites and, if so, how? If you cite a journal article or book, the reader can go to a library, find the cited document and check whether or not it says what you say it did. A web site may disappear, and it may have been updated or changed completely. So references to the web are usually less satisfactory. Nevertheless, there are some very useful and authoritative sources. So, if the rules of your institution permit it, it may be appropriate to cite web sites. (Be cautious, and don't overuse such citations. In particular, don't use a web citation where you could reasonably use a "hard" citation. Remember that your examiners are likely to be older and more conservative.) You should give the URL and also the date you downloaded it. If there is a date on the site itself (last updated on .....) you should include that, too.

Appendices

If there is material that should be in the thesis but which would break up the flow or bore the reader unbearably, include it as an appendix. Some things which are typically included in appendices are: important and original computer programs, data files that are too large to be represented simply in the results chapters, pictures or diagrams of results which are not important enough to keep in the main text.

Source: http://phys.unsw.edu.au/~jw/thesis.html

Thursday, December 17, 2009

ISODEL 2009 - Parallel Session

The ISODEL 2009 conference, held at the Sheraton Hotel in Jogyakarta from 8 to 11 December, saw 55 speakers present their papers.

Prof. Mansor during his slide presentation

Among the speakers were Senior VP of OUM, Prof. Dr. Mansor Fadzil, Richard Ng (Director of OUM Perak Learning Centre), Tuan Fatma, Rozeman (Senior Lecturers of OUM) and Mohd Jamaluddin (Head of Counseling Unit).

Presentation by Richard Ng

Prof. Mansor's session was slotted in the second parallel session, from 3pm to 4.30pm. For the benefit of readers, I have posted the video clips of Prof. Mansor's presentation in 2 parts as follows:

Video clip part 1:

Video clip part 2:

The following are video clips in three parts of the presentation by Richard Ng:

Video clip part 1:

Video clip part 2:

Video clip part 3:

Wednesday, December 16, 2009

ISODEL 2009 - Plenary session by YBhg Tan Sri Anuwar Ali

YBhg Prof. Emeritus Tan Sri Anuwar Ali, President and Vice Chancellor of OUM was among the plenary speakers on Day 3 of the ISODEL 2009 in Jogyakarta.

Plenary speakers during the ISODEL 2009 in Jogyakarta

Other speakers included Fasli Djalal (Director General of Higher Education, MONE, Indonesia), Tian Belawati (President of AAOU) and Ronald Perkinson (Sampoerna School of Education).

The topic of YBhg Tan Sri's presentation was "International Outreach in Open, Distance E-Learning: The experience of Open University Malaysia". The session began with a short presentation of the speaker. See video clip Part 1 below.

For the benefit of the readers of this blog, I have posted video clips of YBhg Tan Sri's presentation here in four parts:

Part 1:

Part 2:

Part 3:

Part 4:

Part 5 - Question and Answer session:

Tuesday, December 1, 2009

Structural Equation Modeling

A Conceptual Overview

Structural Equation Modeling is a very general, very powerful multivariate analysis technique that includes specialized versions of a number of other analysis methods as special cases. We will assume that you are familiar with the basic logic of statistical reasoning as described in Elementary Concepts. Moreover, we will also assume that you are familiar with the concepts of variance, covariance, and correlation; if not, we advise that you read the Basic Statistics section at this point. Although it is not absolutely necessary, it is highly desirable that you have some background in factor analysis before attempting to use structural modeling.

Major applications of structural equation modeling include:

1. causal modeling, or path analysis, which hypothesizes causal relationships among variables and tests the causal models with a linear equation system. Causal models can involve either manifest variables, latent variables, or both;
2. confirmatory factor analysis, an extension of factor analysis in which specific hypotheses about the structure of the factor loadings and intercorrelations are tested;
3. second order factor analysis, a variation of factor analysis in which the correlation matrix of the common factors is itself factor analyzed to provide second order factors;
4. regression models, an extension of linear regression analysis in which regression weights may be constrained to be equal to each other, or to specified numerical values;
5. covariance structure models, which hypothesize that a covariance matrix has a particular form. For example, you can test the hypothesis that a set of variables all have equal variances with this procedure;
6. correlation structure models, which hypothesize that a correlation matrix has a particular form. A classic example is the hypothesis that the correlation matrix has the structure of a circumplex (Guttman, 1954; Wiggins, Steiger, & Gaelick, 1981).

Many different kinds of models fall into each of the above categories, so structural modeling as an enterprise is very difficult to characterize.

Most structural equation models can be expressed as path diagrams. Consequently even beginners to structural modeling can perform complicated analyses with a minimum of training.

The Basic Idea Behind Structural Modeling

One of the fundamental ideas taught in intermediate applied statistics courses is the effect of additive and multiplicative transformations on a list of numbers. Students are taught that, if you multiply every number in a list by some constant K, you multiply the mean of the numbers by K. Similarly, you multiply the standard deviation by the absolute value of K.

For example, suppose you have the list of numbers 1,2,3. These numbers have a mean of 2 and a standard deviation of 1. Now, suppose you were to take these 3 numbers and multiply them by 4. Then the mean would become 8, and the standard deviation would become 4, the variance thus 16.

The point is, if you have a set of numbers X related to another set of numbers Y by the equation Y = 4X, then the variance of Y must be 16 times that of X, so you can test the hypothesis that Y and X are related by the equation Y = 4X indirectly by comparing the variances of the Y and X variables.
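The arithmetic in this example is easy to check directly. The short sketch below uses Python's standard `statistics` module (sample standard deviation and variance, dividing by n - 1); the helper name `scaled_stats` is hypothetical.

```python
from statistics import mean, stdev, variance


def scaled_stats(values, k):
    """Mean, standard deviation and variance after multiplying every value by k."""
    scaled = [k * v for v in values]
    return mean(scaled), stdev(scaled), variance(scaled)


x = [1, 2, 3]
print("X:     ", scaled_stats(x, 1))   # the original list
print("Y = 4X:", scaled_stats(x, 4))   # every number multiplied by 4
print("variance ratio:", scaled_stats(x, 4)[2] / scaled_stats(x, 1)[2])
```

Multiplying by a negative constant such as -4 changes the sign of the mean but leaves the standard deviation at 4, which is why the rule is stated in terms of the absolute value of K.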

This idea generalizes, in various ways, to several variables inter-related by a group of linear equations. The rules become more complex, the calculations more difficult, but the basic message remains the same -- you can test whether variables are interrelated through a set of linear relationships by examining the variances and covariances of the variables.

Statisticians have developed procedures for testing whether a set of variances and covariances in a covariance matrix fits a specified structure. The way structural modeling works is as follows:

1. You state the way that you believe the variables are inter-related, often with the use of a path diagram.
2. You work out, via some complex internal rules, what the implications of this are for the variances and covariances of the variables.
3. You test whether the variances and covariances fit this model of them.
4. The results of the statistical test are reported, along with parameter estimates and standard errors for the numerical coefficients in the linear equations.
5. On the basis of this information, you decide whether the model seems like a good fit to your data.
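Steps 2 and 3 can be sketched for the simplest possible model, Y = aX with no error term. If the model holds, the variance of Y must be a^2 times the variance of X and the covariance of X and Y must be a times the variance of X. The sketch below compares that implied covariance matrix with the sample one. The function names, the data, and the plain sum-of-squared-differences measure of misfit are all assumptions made for illustration; real structural modeling programs estimate the parameters and use more refined discrepancy functions.

```python
from statistics import mean


def sample_cov(u, v):
    """Sample covariance of two equal-length lists (n - 1 denominator)."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def implied_cov(a, var_x, var_e=0.0):
    """Covariance matrix of (X, Y) implied by Y = a*X + E, with X and E uncorrelated."""
    return [[var_x, a * var_x],
            [a * var_x, a * a * var_x + var_e]]


def discrepancy(sample, implied):
    """Sum of squared differences between two covariance matrices."""
    return sum((s - m) ** 2
               for row_s, row_m in zip(sample, implied)
               for s, m in zip(row_s, row_m))


# Hypothetical data in which Y is roughly 2X.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
s = [[sample_cov(x, x), sample_cov(x, y)],
     [sample_cov(y, x), sample_cov(y, y)]]

for a in (2, 4):
    print("a =", a, "misfit =", discrepancy(s, implied_cov(a, sample_cov(x, x))))
```

The model with a = 2 leaves a far smaller misfit than the model with a = 4, so on these data it is the better description of the variances and covariances.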

There are some important and very basic logical points to remember about this process. First, although the mathematical machinery required to perform structural equation modeling is extremely complicated, the basic logic is embodied in the above 5 steps. Below, we diagram the process.

Second, we must remember that it is unreasonable to expect a structural model to fit perfectly — for a number of reasons. A structural model with linear relations is only an approximation. The world is unlikely to be linear. Indeed, the true relations between variables are probably nonlinear. Moreover, many of the statistical assumptions are somewhat questionable as well. The real question is not so much, "Does the model fit perfectly?" but rather, "Does it fit well enough to be a useful approximation to reality, and a reasonable explanation of the trends in our data?"

Third, we must remember that simply because a model fits the data well does not mean that the model is necessarily correct. One cannot prove that a model is true — to assert this is the fallacy of affirming the consequent. For example, we could say "If Joe is a cat, Joe has hair." However, "Joe has hair" does not imply Joe is a cat. Similarly, we can say that "If a certain causal model is true, it will fit the data." However, the model fitting the data does not necessarily imply the model is the correct one. There may be another model that fits the data equally well.

Structural Equation Modeling and the Path Diagram

Path Diagrams play a fundamental role in structural modeling. Path diagrams are like flowcharts. They show variables interconnected with lines that are used to indicate causal flow.

One can think of a path diagram as a device for showing which variables cause changes in other variables. However, path diagrams need not be thought of strictly in this way. They may also be given a narrower, more specific interpretation.

Consider the classic linear regression equation

Y = aX + e

Any such equation may be represented in a path diagram as follows:

Such diagrams establish a simple isomorphism. All variables in the equation system are placed in the diagram, either in boxes or ovals. Each equation is represented on the diagram as follows: All independent variables (the variables on the right side of an equation) have arrows pointing to the dependent variable. The weighting coefficient is placed above the arrow. The above diagram shows a simple linear equation system and its path diagram representation.

Notice that, besides representing the linear equation relationships with arrows, the diagrams also contain some additional aspects. First, the variances of the independent variables, which we must know in order to test the structural relations model, are shown on the diagrams using curved lines without arrowheads attached. We refer to such lines as wires. Second, some variables are represented in ovals, others in rectangular boxes. Manifest variables are placed in boxes in the path diagram. Latent variables are placed in an oval or circle. For example, the variable E in the above diagram can be thought of as a linear regression residual when Y is predicted from X. Such a residual is not observed directly, but calculated from Y and X, so we treat it as a latent variable and place it in an oval.
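The remark that the residual is calculated rather than observed can be made concrete. The sketch below assumes X and Y are expressed as deviations from their means, so that Y = aX + e needs no intercept, and estimates a by ordinary least squares; the function name and the data are hypothetical.

```python
from statistics import mean


def regression_residuals(x, y):
    """Return the slope a and the residuals e for Y = aX + e on mean-centred scores."""
    mx, my = mean(x), mean(y)
    xc = [v - mx for v in x]
    yc = [v - my for v in y]
    # Least-squares slope for centred variables.
    a = sum(p * q for p, q in zip(xc, yc)) / sum(p * p for p in xc)
    # The residual is whatever part of Y the term aX does not reproduce.
    e = [q - a * p for p, q in zip(xc, yc)]
    return a, e


# Hypothetical manifest variables X and Y.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
a, e = regression_residuals(x, y)
print("a =", a)
print("e =", e)
```

Nothing in the data file records e; it exists only once a has been estimated, which is the sense in which it is treated as latent.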

The example discussed above is an extremely simple one. Generally, we are interested in testing models that are much more complicated than these. As the equation systems we examine become increasingly complicated, so do the covariance structures they imply. Ultimately, the complexity can become so bewildering that we lose sight of some very basic principles. For one thing, the train of reasoning which supports testing causal models with linear structural equations has several weak links. The relations among the variables may be nonlinear. The variables may be linearly related for reasons unrelated to what we commonly view as causality. The ancient adage, "correlation is not causation" remains true, even if the correlation is complex and multivariate. What causal modeling does allow us to do is examine the extent to which data fail to agree with one reasonably viable consequence of a model of causality. If the linear equations system isomorphic to the path diagram does fit the data well, it is encouraging, but hardly proof of the truth of the causal model.

Although path diagrams can be used to represent causal flow in a system of variables, they need not imply such a causal flow. Such diagrams may be viewed as simply an isomorphic representation of a linear equations system. As such, they can convey linear relationships when no causal relations are assumed. Hence, although one might interpret the diagram in the above figure to mean that "X causes Y," the diagram can also be interpreted as a visual representation of the linear regression relationship between X and Y.

Source: http://www.statsoft.com/TEXTBOOK/stsepath.html