Dr. Kimberly A. Hammond, University of California-Riverside

**RICHARD CARDULLO:** Well, good morning. I should begin by saying that Kim Hammond and I have talked together many times, a few observations right off the start. First of all, this is about one half to one third the size of our normal classroom. We teach about 600 students at a time in our freshman biology class. Second is I'm just going to guess that you're a little bit older than most of our students. Third is you're awake-- I think. Of course, we don't feed our students coffee before we start.

We're going to actually-- as Jim told you earlier, we're a first cohort MSP project, a targeted project. And we're going to take Jim's challenge up a little bit and challenge some of our own assumptions at the beginning of the project. And, as biologists, share with you some of the natural history of this longitudinal study that we've already undertaken.

So, of course, we all start with a central question, and, of course, this looks familiar to most of you, because most of you have a question that looks just like that in your MSPs. It's particularly vague. And when I get attacked by some of my hard science colleagues and they say, “That's a very vague central question,” I say, “Yeah. It's kind of like asking how does increased burning of fossil fuels ultimately lead to global warming?”

We all ask very vague questions. And the trick is to tease them apart and to establish lines of evidence that convince others. We have a series of goals in this project. And as with other projects, we took an experimental approach to address the questions by providing professional development to teachers in mathematics, which we're going to call the treatment. And we looked at a variety of outcomes as indicators: increased teacher content knowledge, verification that our strategies, the ACTS strategies, were delivered in the classrooms, and evidence that student learning was ultimately improved. Our goals were designed with those outcomes in mind. And actually, Jimmy's set us up perfectly. Because in many of those stories where people were telling Jimmy, “Jimmy, you want to test more and you want to provide more training?” you can just insert “Rich,” because we had the same lessons-- any of our team had those same lessons. And again, one of the key goals, something that we learned very early on-- it's number four on our list, I think number three on Jimmy's-- was absolutely essential, and NSF got this right from the very beginning: establish an effective partnership that doesn't just include the superintendent or even the principals, but the teachers.

We actually brought Jim Hamos in very early on, at our kickoff. And he wagged his finger at everybody and said, “You are going to be part of a national dialogue.” And that's what we expect. We expect to see evidence-based claims. And we expect that you will all participate. And that was very effective-- or at least, later in the talk we'll see exactly how effective Jim actually was. So, in all of our projects, there are a number of persistent questions that need to be asked.

In order to maximize any chance of seeing an effect, we realized that we would need to not only establish an effective partnership with the stakeholders, but that we would need to constantly evaluate key questions related to the experimental design. This whole notion of mid-course corrections has reared its head a number of times in our particular project.

The models that were chosen in the study, the types of data that would be collected, the methods by which those data were collected-- and constantly asking, what constitutes evidence that all of these approaches are valid and reliable? This, of course, is the hallmark of any scientifically based experimental study, especially one that, like any ecological study in biology, is subject to a number of environmental changes that may vary in both time and space.

The first four questions on that list have been discussed at length at the various workshops at this conference. For our project, details about the types of professional development and the various assessment tools were provided yesterday by Kathy Bocian, and details about school performance indicators against the background of statewide assessment data were given by Mike Bryant. If you were unable to attend these, I urge you to look at them, as well as the many other excellent presentations in the Virtual Poster session hosted by MSPnet. We really view that as a powerful tool.

In this talk, we're going to focus on the last question, which Jim set us up for perfectly as well. That brings us back to the first MSP conference and what really constitutes evidence in a project like this.

Okay, so our targeted school district has the following characteristics. It looks like a lot of Southern California schools. Our particular school district-- at the time of the study-- had slightly over 20,000 total students: about 11,000 in elementary and 9,000 in secondary. Fifty-seven percent received free lunch; 65% were Hispanic, 28% Caucasian, and 5% African-American. Twenty-seven percent were limited-English-proficient students back in 2002. There was an overall dropout rate of about 20%, slightly higher in the Hispanic and African-American populations than in the Caucasian. So, not atypical for a lot of the studies, at least in California.

When we're looking at mean SAT-9 scores in that particular year, you see a trend that most of us have seen: generally they're low, and there is a general decline over time from fourth grade to 11th grade. We, of course, focused specifically on mathematics. Our project tracks grades four through eight while providing professional development for grades four through six.

So, our design was supported by a rather complex logic model, which I have come to appreciate and love because it fits in very well with my background as a cell biologist and systems physiologist-- with lots of inputs, of course, and feedback. But I'm going to propose for the scientists in the crowd today a very simple hypothesis, okay? And it's this-- and again, many of us use similar hypotheses: that participation by teachers in our ACTS professional development will lead to measurable changes in content knowledge by teachers; that this will actually be delivered into the classroom, so that those concepts, content, and pedagogy make it there; and that ultimately this will lead to some sort of change in students' performance.

Now, that's an outcome. In our particular case, and for many of our cases, it's measured using standardized tests in mathematics. So I'm presenting you with this simple hypothesis, which outlines the important features of our logic model and determines both the independent and dependent variables that need to be evaluated for this longitudinal study. That's going to be a theme for the rest of the talk. Knowing that we would be using student outcomes as our ultimate dependent variable, we're going to show you some of the details of the project. And I'm going to use a backward design philosophy, which many of us use in our research and in designing the courses we teach. So those arrows aren't in that order.

We're going to start. I'm just going to give you some details about what we're using. Each one of these needs to be validated, right? So if we're looking at student performance, ultimately whatever we're using as the outcome needs to be validated. So, what are those four?

Before we start there, I just want you to see what our design looks like. Again, we have a relatively small number of schools. The advantage of that is that you might be delusional, like we were at the beginning of the project, thinking that this is something that we might be able to control. Even if you have buy in from the teachers, that over four years, we might be able to control this.

We broke this into a four-year longitudinal study where we had treatment and control groups. In this particular case, the treatment groups are receiving professional development; the control groups are not. They were paired by similar socio-economic, achievement, and demographic criteria. We then performed a coin flip to determine who would get it first. That required going to a variety of groups who became angry when they weren't selected. I was able to convince the school board that, heck, we're scientists-- we might be doing damage here. We don't know.

And if you want to see some real data, we can show you some that might suggest that early on we might have been doing just that. So, after they heard that, they were a bit more afraid, and the mood changed. But we already had the letters and the buy-in from the teachers, so that was fine. As we moved forward in time and included more groups, you of course notice that we're losing our control groups. And so in year three we included a couple of other school districts with similar characteristics to serve as long-term control groups. Of course, once we have gone past year four in this idealized model, we've now included all 16 schools in the district, and we continue to follow progress using a variety of methods.

So, moving upwards: changes in student performance on standardized tests. We used a variety of assessments, including the California Standards Test, district-supported Criterion Reference Tests, and the Extended Standards Test that was developed as a result of our partnership with the target district. And that was one of those examples where they said, “Rich, you want them tested more?”

Well, in fact, we were able to negotiate with the district to substitute one of their quarterly tests with our Extended Standards Test, which is a combination of multiple-choice and free-response questions focused on problem-solving strategies and conceptual knowledge in mathematics. In addition to these, we used other student indicators, such as a motivational survey. And we're planning to look at course-taking patterns of pre- and post-ACTS students in the target district as well.

Our professional development-- and this is very simplified here-- was presented as four modules per year that all focused on different aspects of mathematics content and pedagogy in grades four through six. Each cohort of teachers was enticed to volunteer for all of these modules, which were deeply rooted in the principles of Adding It Up. The first module focused on content, with teachers only. The following three modules were lab-school environments that included both teachers and students.

Now, I should tell you right now that the volunteerism aspect of this project, which you'll hear about in a few minutes, posed some particular problems. Volunteerism meant, of course, that teachers were free to participate or not. But they were compensated quite well monetarily, thanks to NSF. So they didn't just willingly say, “Yes, we'd like to participate because everything looks great.” They were paid pretty well, and they were given materials to carry out many of these ACTS strategies in the classroom.

Baselines of teachers' prior content knowledge and pedagogical strategies were assessed using a variety of instruments. And the delivery into the classrooms was monitored using trained observers who were blind to whether the teachers were in a control or a treatment group. So lots of different tools were used, which creates, of course, lots of different data. Here are some of the things that were used. We used a survey of teaching practices based on Horizon Research instruments. We used the University of California and California State University Mathematics Diagnostic Placement Tests, for both algebra and geometry readiness. And we used the Content Knowledge for Teaching Mathematics instrument, developed by Deborah Ball of the University of Michigan.

So, this all brings into question at this point the very last thing, at the very top of the hypothesis, which is teacher participation. The important question becomes: how reliable and valid is this independent variable? In this case, the independent variable is our measure of teacher participation in professional development.

In the end, as an experimentalist, you say, “That experiment is only as good as the establishment of that independent variable.” For instance, the determination of length is only as good as the accuracy and reliability of the ruler that's employed. Or the amount of time it takes to cover some distance is only as good as the accuracy of the timing device that you're using.

The therapeutic or lethal dose of a drug can only be reliably determined if it can be accurately measured. Right? Makes sense. There are many good examples of this in history. And one of my favorites is-- hopefully many of you have read this-- Dava Sobel's book Longitude. If you haven't read it, you may have read her Galileo's Daughter. This is a really good example if you want to get across to people why it's important to establish a reliable independent variable. And if you read the book carefully, you can substitute “math/science education” for “longitude” in a couple of different places. It's kind of interesting.

So the whole idea, if you don't know the story, is that while sailors can readily gauge latitude by looking at the height of the sun or guiding stars above the horizon, the measurement of longitude actually challenged navigators for centuries. And what that meant was there were lots and lots of shipwrecks. And this became a big problem-- like I said, keep putting “education” in here. The most famous scientists of the day tackled this problem: Galileo, Newton, Halley. And they used the knowledge they were most famous for-- the movement of the celestial bodies-- to try to tackle this problem.

Well, they weren't getting anywhere, so a prize fund was established. The English Parliament offered a 20,000-pound prize to anyone who could solve the longitude problem. Now, if you go back and figure out what 20,000 pounds is in today's dollars, it's about the size of our MSP targeted grant-- about $5 million.

But it was a self-educated inventor named John Harrison who actually figured this problem out, because he built a chronometer-- a really reliable timepiece that was insensitive to its environment. It was basically friction-free, impervious to the pitch and roll that you would see on the ocean, and to changes in temperature and humidity. And what he was doing was establishing the reliability of this independent variable used to determine longitude.

The work took many decades. It wasn't done in a week. It wasn't done in a year. It wasn't done in ten years. It took most of his life. And even at that, the Board of Longitude, which was a panel of scientists and government officials, didn't believe it would work. And so he was only given a very small portion of the award in the end, even though he had solved the problem.

His approach, however, obviously ultimately succeeded. It's a great example of the importance of establishing a reliable independent variable-- one that allowed mariners to navigate treacherous waters safely. So, for MSP-- well, there they are. There's Newton, there's Galileo. They failed.

So, for MSP studies such as ours, the establishment of a reliable measure of what we mean by teacher participation is an important question that has to be constantly confronted. In our study, that means we have to ask the following questions. How is participation ensured over all four years? In our case, we offered enticements, and we established a strong partnership with clear expectations in the first year. And then, as you'll find out, we had to come back and do that again.

How do you actually measure participation? Does participation mean that teachers simply show up at professional development activities? And who is participating-- is it teachers? Is it schools? Is it the school district? All of these things are obviously important. I know that many of you are looking at dosage effects. But even within that you need to ask serious questions. Can participation be viewed as an all-or-nothing event? Well, that depends on the subjects used.

And when you really start thinking about it, you start asking questions about the prior knowledge and experiences of the teachers involved in those professional development programs. And finally, extrinsic factors are always the fly in the ointment that may affect participation. What happens when factors outside the control of the partnership, such as district and statewide initiatives, intrude on the experimental design itself? Whatever the measure of participation is, it's important to note that all these factors will lead to reporting a central-tendency value with some associated error that must be determined each year.

Kim Hammond's now going to show you specific data on teacher participation for the first five years of the study. What she's actually going to be doing in this case-- because she's an ecologist-- is actually showing the natural history of that participation over the first five years.

**KIM HAMMOND:** Okay, so once I figure this out-- Okay. So I'm going to tell you more about-- and I walk around a lot, so I'll try to stay near the microphone. But somebody can do this if you can't hear me.

I'm going to talk about how we measured that participation. And you have to remember that we're really focusing not on the outcomes here, because we really want to validate the independent measure first. Without knowing how much participation we had on the part of our teachers-- and you'll see we had a couple of shipwrecks along the way-- we can't really talk about student exposure, and we can't really talk about anything else as the dependent variables.

So validating that independent variable is important. So, just like everybody else has talked about, there are some complexities to almost all experimental designs. And this was surely not different from those. These are some of them that we came across, and they're complexities that are common to everybody here. I heard about this a lot yesterday.

The first one is teacher and student mobility in the district. Now, remember, our research design is very volunteer-based. So somebody can just up and leave. They volunteered, and we lose them completely. And they have really good reasons for doing that: their administrators move them around to different grade levels, and teachers and students move in and out of the district. And so we might lose our volunteers from our targeted grades. Today I'm going to talk only about grades four through six. So that mobility is important.

The volunteer rate is also affected by other things-- extrinsic factors such as state legislation. Many of you know about Assembly Bill 466, which came online in 2004-2005. It severely reduced teachers' ability to participate in our professional development. It also required teachers to learn how to do scripted lessons, keeping to the same lesson plan that everybody else in the district was following at the same time.

So our teachers' willingness to be involved hit a bit of a low. And then the final complexity-- I don't want to call it a problem, because sometimes there's a hidden silver lining-- is that there are variable amounts of professional development given to each teacher. And so we have to start gauging the dosage, not just whether they did or did not have professional development.

So I'm going to walk you through at least the first two years to show you how different a year can be. I'm going to tell you about the total number of teachers eligible-- first I'm going to talk just about teachers, and I'll get to the students later-- then the teachers that took our professional development, then those teachers that actually ended up back in the classroom, because it matters whether they go back to the classroom, and then the controls that we had. There's a slight dependency on that.

So, the first year was the beginning. We have a staged design, remember, so we only had a few schools. We had a total of about 67 possible teachers who could opt to take the professional development. Of those, 40 took it-- about 60%. And of those, 36 ended up in the classroom. That's about a 90% transfer rate. Now, in that one year I can account for those other four teachers just fine. Three of them were teachers on special assignment-- they were meant to roam about the school and help everybody. And the fourth person was a principal. And we don't want principals in the classroom. Just kidding.

So we had a professional development rate of about 60%. We would have preferred 100, but okay-- the data are what they are. And the transfer rate was about 90%, with some controls left over. That wasn't a bad start. But then the next year happened, and I just want you to see a completely different year. These numbers are cumulative, so they're going to keep increasing.

And so, necessarily, you're going to see more and more teachers who were available for professional development. So 122 were available. Only 60 took our professional development. But this is the year AB 466 happened. There was some running around-- the sky is falling, what do we do now? Shipwreck.

Everybody, including us, was worried. Of those 60, the teacher mobility issue really increased, because now teachers were being shifted all over the place, into different grades. And we only got about 35 of those teachers into the classroom-- only about 58% of the people we had invested the time in.

Now, we did have to stop and change the design. Some of these people would come back into the classroom in subsequent years, so it's not really a waste, but it appeared a little scary at first. This legislation required these teachers to take a lot more outside professional development.

So what we did was we went to the district and we said, okay, this is a partnership between you guys and us in a similar fashion to what Jimmy was talking about. Is there any way you could help us to, one, reduce the mobility? And, two, can we use our professional development to help augment or replace some of the professional development required by the state legislation?

And there was a way to do that, and they were very good about accommodating us. So we really were able to benefit from that good partnership. In the remaining years-- the next year seems a bit worse, but this is when AB 466 was actually implemented. It came online in 2004-2005, and in 2005-2006 our school district became a program improvement district.

So a little bit less participation-- I mean a little bit more participation and professional development, but a little bit less into the classroom. But beyond that, because the district came online with us, we had a very rapid increase again. And we tried to-- we really stemmed the bleeding, as we say.

We actually were able to improve the participation quite a bit. We're also in the out years now, and we're going to continue to do this in the out years. And what we've found is that we're actually getting much better participation as the teachers all get back into the classroom. And so we can measure the amount of exposure students get by the participation their teachers had originally.

So, overall, the big lesson-- and allow me to make a messy slide messier for you-- was that we had low participation by the teachers in professional development, or low relative to the 100% we wanted, but it was fairly consistent. That's the cost of having volunteer-based participation: you don't get everybody. And then we also had a relatively variable, and then recuperating, transfer rate of that training into the classroom. That's the benefit of a good partnership-- actually being able to go to the district, make some changes, and increase the participation, or the transfer back into the classroom. And that's a really good thing.

So maybe you would like to see something about how that affected our experimental design overall. Remember, we started off with this design, which assumes 100% in all of those categories. But that's not what we got-- we got a little bit less than 100% in all those categories. These numbers just reflect what was on the graph. How did that affect the number of classrooms touched, or exposed in some way? That's the really important variable. Well, some of these teachers taught in multi-grade classrooms-- that means they may have had a fourth- and fifth-grade combo.

So, just for the sake of argument, I counted a fourth-grade class as different from a fifth-grade class. Therefore, if you are good at adding up the numbers and you say, “There were 36 teachers. How come there are 41 classrooms?”-- that's why. And as expected, we had a growing number of classrooms exposed. It starts to level off in the last two years. We hope that that constant level maintains, or even increases slightly.

And overall, over the first five years of our program or our project, we've actually exposed about 353 classrooms to the professional development. Now, I'm not saying that every teacher gave every bit of information that we had. I'm using “exposed” as a general term for “this is what they could have gotten.”

Now, again, this is the independent variable by which we can measure everything else. And there's one more part to that. That 353 is about 40% of all the classrooms over that period of time. So we learned that, overall, the teacher participation rate was less than 60%, and that teachers left the target grade levels, the school, or the district at a much higher rate than we expected.

And so our suggestion is: you need to maintain the agreed-upon sample sizes if possible, and you need to have a lot of buy-in and assurances that the teachers will not move between the grades. We really did go out there and work at it. We're a small project-- we could get to every teacher and talk about why we thought this was important.

The teachers' willingness to participate, to volunteer, was limited. So we increased the incentives for teacher participation as much as possible. And part of that was just trying to convince them that this was an important project. And we observed that, despite the reduced number of teachers in professional development, there was a growing number of trained teachers in the classroom over time. But the second part of measuring that independent variable is not just, as I said, whether they had the treatment or not, but how much they had. Teachers had different amounts of professional development.

If you remember, Rich told you there were four types, and they were slightly different. And there are different numbers of hours in each of those types of professional development-- another complexity that has to be factored into understanding the independent variable before we can even start talking about results.

So, I have here just a general description of how much professional development teachers took. The way these bar graphs work is that they're like the high-low-close graphs of the stock market. And since the stock market is doing me no good right now, those graphs might as well help us here with teacher participation.

The first year was a smaller year-- it was the beginning year. Teachers took from 30 to 54 hours, with an average of about 52.6, or 53, hours, shown in the blue box. And that's the way every bar works. Teachers could take the professional development in the next year as well, so they can start building on it.

And what we see is that over the subsequent years, we increased the maximum to about 168 hours. That six down at the bottom is really only for one teacher who had to opt out. But those were the minimum and the maximum over the course of those years.

What's important about this graph-- the lesson for me-- is that in the context in which we did our professional development, with all the changes in legislation, you can probably expect about 90 hours maximum from each teacher. And then you add on the professional development they have to take from other sources, and it might be 150 hours. So you have to limit your professional development to something that's doable, because they have a lot of other things to do. And just as they say, “More tests?” they say, “More professional development? Are you kidding?” So you do have to be cognizant of their needs that way.

Okay, well, I wanted to convince you that this was evenly distributed across grade levels. So here on the graph I have the years across the x-axis-- that's time-- and on the y-axis, the average hours of professional development that went into the classrooms in grades four, five, and six. The error bars are one standard error of the mean. So there's really no statistical difference along the way until we get to the out year, and even then it's only fourth grade that falls off a little bit. Everything else within the treatment period is about equal. The hours do increase, because we offered more hours over time. But across the grade levels, it was fairly even.

Okay, now I want to talk a little bit more about the students. How does all this translate to student exposure? Here I'm moving from the ability to talk about the independent variable to thinking about the dependent variable and how it's going to respond.

Student exposure, of course, may be proportional to the number of hours of professional development their teachers had. Because the teachers' professional development rate was low, we might expect students' exposure to be lower than we'd hoped. And then there's one added variable: cumulative exposure. Students may be in professional-development classrooms for one, two, or three years. So their summative exposure is the sum of what they get in any one year plus what they might get in subsequent years-- assuming that they're being taught the lessons that we taught the teachers.

So, just to give you an idea of the numbers of students: we started off with about 1,100 students a year for the first two years. That grew, as the number of teachers in professional development grew, to about 1,500 and then 2,500 per year, for a total so far of about 6,400 students. For context, that's about 50% of the maximum number of students possible. Because students are exposed one, two, or three times, that may bias these numbers a bit, but it's about 50% of what we could possibly have exposed over the course of those five years.

Like I said, some students were exposed more than once. This gives you a bit of an estimate of how that worked. About 27% of the students were in professional-development classrooms over the course of two years. Now, this is another question we'll be able to ask with our data: did it matter whether they received it in more than one year or not? And about 6%-- about 300 students-- got it three times over three years. So every year-- four, five, and six-- they had a teacher who had taken our professional development classes.

Again, the question is: does this make a difference? Of course, this is a very small sample size, but it's one that we can start exploring. And as I said, we have not analyzed the data yet.

So, for our entire project so far-- the theme of this conference is evidence-- what have we learned, what lessons have we learned, and how do we know that? What do we know? There are three things I'm going to talk about, and then I'm going to tell you what we still want to know.

First, it's important to establish an effective and interactive partnership that is ongoing with the district. Without that you can't go back when things go wrong. And everybody's reiterated that time and time again.

If your project does rely on volunteers, it's important to reduce the cost of volunteerism, as I'm calling it, by making teachers and administrators stakeholders in the project. And that really does matter. Once you take the time to explain yourself to the teachers, you get a lot more buy-in. And we had cases where the teachers went to the administration and said, “We like this. We want to keep doing this.”

And thirdly, the establishment of a reliable independent variable-- and this is really important-- such as our measure of participation in professional development, is critical to understanding the validity of the dependent variables, or outcomes. The study is only going to be as good as the independent variable, or at least our understanding of it. Those are our lessons so far.

What do we still need to know? Well, everything-- we haven't even answered any of the big questions yet. Do the levels of student exposure resulting from teacher involvement continue into the ongoing years? Can we keep collecting data for as long as possible, and will it be useful data? Have the students learned the key ACTS concepts that we tried to deliver to them through their teachers? Well, first, how well did the teachers learn them, and then how well did the students? Has our professional development resulted in changes in student outcomes on standardized tests in mathematics? A big one: are there measurable increases in enrollments in Algebra I by the eighth grade? Because that was the big target. And for the future of this particular project over the next couple of years: are there changes in course-taking patterns in algebra?

We would hope that more students would be interested in algebra, and that the pattern would hold not only in algebra but in geometry and on into high school mathematics. And with that, I am done. Thank you.