Thing

DISCLAIMER


Welcome! The goal of this blog is to share my analysis of the free, publicly available user-reported law school applicant data from Law School Numbers. Using the data from Law School Numbers is problematic for a variety of reasons (such as users misreporting their actual information, users creating fake accounts, selection bias, etc.) and if I had access to it, I'd much rather work with the data that schools themselves have on applicants. We have what we have, though. Also, while I do have some facility with the type of statistical analysis I employ in my blog posts, I am far from being a professional statistician. I am doing this solely for the purpose of providing my analysis to interested readers, getting feedback, and generating discussion. What I am not doing is prescribing courses of action for law school applicants, or pretending to actually know what goes on behind closed doors in law school admission committees' meetings. I am, however, interested in looking at the story the numbers seem to portray, and sharing that with people with similar interests. I think I'll be able to provide a lot of interesting, and perhaps even helpful, analysis here, but at the end of the day, it is up to the individual law school applicant to put together applications and application strategies tailored to his or her own hopes and goals.

Wednesday, May 29, 2013

University of California Berkeley Profile

I am sorting through the data I have, going through schools alphabetically. It's going to be a long project, but I'm making progress day by day. For now, though, by request, I am posting a school profile based on the number crunching I did on the UC Berkeley data available from Law School Numbers. Again, there are a variety of important factors to keep in mind when considering this analysis, including the possibility that some of the datapoints are completely bogus, and that the LSN users might be skewed a little towards the top end of applicants. Given the number of datapoints we have, though, and that I've done at least a little pre-emptive data cleaning, I think this is worth a look. The first table I present shows the results of an ordered logistic regression, which allows me to use folks who reported acceptances, rejections, and waitlists. Eventually I'll get around to creating descriptions for all of these types of analysis so I can just say "click here to see what I mean by this," but for now, let me just explain it. In an ordered logistic regression, the dependent variable is an ordered categorical variable, and in our case I have coded an acceptance as a 2, a waitlist as a 1, and a rejection as a 0 (in descending order of desirability, although I know plenty of waitlisted candidates who cry that they'd rather just be put out of their misery with a rejection than suffer in purgatory). The data you see below gives the percentage increase in the likelihood (strictly speaking, the odds), other variables controlled for, of being either:
      • Accepted rather than (combined waitlisted/rejected)
      • (Combined accepted/waitlisted) rather than rejected
I know, I know...it's a little confusing. By way of example: in this model for Berkeley, for two otherwise identical candidates, an additional point on the LSAT increases an applicant's chances of being accepted rather than either waitlisted or rejected by 29.3% (in other words, a 170 is 29.3% more likely than a 169 to get accepted rather than waitlisted or rejected, all else being equal). A 170 is also 29.3% more likely than a 169 to get either accepted or waitlisted rather than rejected, all else equal. If you have any questions, just e-mail me, and I'll try to explain better. Or, check out this link to the awesome UCLA stats site.
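
To make the "same 29.3% for both splits" idea concrete, here is a minimal Python sketch of how an ordered logit behaves. The 29.3% figure and the 0/1/2 coding come from the post; the two cutpoints are made-up values chosen only for illustration, and LSAT is the only predictor, so this is not the fitted Berkeley model.

```python
import math

# Outcome coding from the post: 2 = accepted, 1 = waitlisted, 0 = rejected.
LSAT_ODDS_RATIO = 1.293                 # the 29.3% per-point boost quoted above
beta_lsat = math.log(LSAT_ODDS_RATIO)   # the same boost on the log-odds scale

# Hypothetical cutpoints on the latent scale (assumptions, NOT from the model).
CUT_REJECT_WAITLIST = 43.0   # boundary between outcome 0 and outcomes 1-2
CUT_WAITLIST_ACCEPT = 44.0   # boundary between outcomes 0-1 and outcome 2


def prob_above(lsat, cutpoint):
    """P(outcome falls above the cutpoint) with LSAT as the only predictor."""
    return 1.0 / (1.0 + math.exp(cutpoint - beta_lsat * lsat))


def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


def odds_ratio_for_one_point(lsat, cutpoint):
    """Odds at (lsat + 1) divided by odds at lsat, for a given split."""
    return odds(prob_above(lsat + 1, cutpoint)) / odds(prob_above(lsat, cutpoint))


# Accepted vs. (waitlisted or rejected), comparing a 170 to a 169.
ratio_accept = odds_ratio_for_one_point(169, CUT_WAITLIST_ACCEPT)
# (Accepted or waitlisted) vs. rejected, comparing a 170 to a 169.
ratio_not_rejected = odds_ratio_for_one_point(169, CUT_REJECT_WAITLIST)

print(round(ratio_accept, 3), round(ratio_not_rejected, 3))
```

Both ratios come out the same because the model uses a single LSAT coefficient for both splits. Note that the 29.3% applies to the odds; the change in the raw probability depends on where the applicant starts.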

In any case, here are the results from this first model, in which I test the impact of LSAT score, GPA, each earlier month the application is sent (and by earlier month, I don't mean month earlier...I mean September vs. October, or October vs. November), URM status, non-traditional status, and female applicant status.  Since Berkeley doesn't have a binding-ED option, I left it off.  Everything is based on LSN data from the 2003/2004 cycle to the present.

                              

The thing that really stands out to me here, although it may be hard to see if you haven't yet seen what these numbers look like for other schools, is the emphasis placed on GPA vs. LSAT. At least judging by the schools I have looked at so far, Berkeley gives a lot more weight to the GPA vis-a-vis the LSAT than other schools do. That boost for a .10-point increase in GPA is massive, and the LSAT boost is relatively small compared to most schools. The boost for each month earlier the application is sent is also very substantial - in fact, many schools seem to give no boost for this. The URM boost is pretty substantial, too, but one thing you have to keep in mind when interpreting this is that there are almost certainly different "floors" for LSAT and GPA for URM applicants than for non-URM applicants. This matters because, while the "boost" indicates that, all else equal, a URM is almost nine times as likely to get in as a non-URM applicant, this is inflated a bit because below the numbers "floors" for non-URM applicants, a URM is pretty much infinitely more likely to get in. The increases for non-traditional applicants and female applicants surprised me a little, too. The final numbers - URM equivalents in LSAT and GPA points - are simply the number of extra LSAT (or GPA) points a non-URM candidate would need in order to get a boost equivalent to that of URM status. It's an interesting way to look at how much the "URM bump" is really worth.
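
One plausible way to arrive at a "URM equivalent in LSAT points" is to divide the two boosts on the log-odds scale. The sketch below assumes that approach. The 1.293 per-point LSAT figure is the one quoted earlier in the post, while 9.0 is only a round placeholder for "almost nine times", so the result is illustrative rather than the number in the table. The function name is my own.

```python
import math


def equivalent_points(target_odds_ratio, odds_ratio_per_point):
    """How many points of a numeric factor match a given odds ratio.

    Solves odds_ratio_per_point ** n == target_odds_ratio for n.
    """
    return math.log(target_odds_ratio) / math.log(odds_ratio_per_point)


URM_ODDS_RATIO = 9.0      # placeholder for "almost nine times" (assumption)
LSAT_ODDS_RATIO = 1.293   # 29.3% boost per LSAT point, as quoted in the post

urm_in_lsat_points = equivalent_points(URM_ODDS_RATIO, LSAT_ODDS_RATIO)
print(round(urm_in_lsat_points, 2))
```

The same function would work for the GPA equivalent by passing in the boost for a .10-point GPA increase and reading the answer in units of .10 GPA points.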

The next model excludes waitlists, including only applicants that reported either being accepted or rejected (whether that was directly, or after first being waitlisted).  The results in the table should be interpreted in the same way, but the interpretation is a little easier.  The number given for each variable is simply the increase in likelihood of being accepted rather than rejected.

                              

Not a whole lot of difference between the two models, although there are slightly bigger boosts for most of the factors if we just consider acceptances vs. rejections, without considering waitlists (which are problematic both for yield-protection reasons - more prevalent at some schools than others, to be sure - and because a lot of "waitlist" profiles belong to people who didn't bother to update with a final status, so we can't just treat them as rejections).

Normally, the next thing I'd do here is take a look at how different factors influence scholarship awards, but because the number of observations for Berkeley is so low, and because it seems like a couple of datapoints really throw the whole thing off due to the small sample size, I'm going to leave that out.  If you're really interested, and promise to not read too much into it, e-mail me and I'll let you know.

Last, I'm including a table that breaks down how non-splitters, splitters, and reverse-splitters are represented in the data.  This one you really have to be careful with, because the data on LSN does skew towards higher-caliber applicants, and so acceptances are more highly represented than they are in the applicant pool.  Really, the value of this kind of thing will become more clear when we can compare schools, because that same "higher-caliber" applicant caveat will apply across the board, so we can probably draw somewhat valid conclusions by comparing schools.  For now, I'll include it for interested parties, but please do not look at this and say to yourself, "Self, as a splitter I have an X% chance of getting into Boalt!"  Promise?  Ok!



So there you have it. I'm interested in thoughts anyone has. For me, the real takeaways of this entire thing are that it pays big to apply to Berkeley as early as you possibly can, and that it's a pretty friendly place for non-traditional students and female applicants. Also, the relative weight Berkeley gives to the LSAT and GPA is different from what we usually see, so if your LSAT isn't up to snuff but you've got a stellar GPA, it might be worth throwing a hail-mary Berkeley's way.

2 comments:

  1. Hi! Thanks for this AMAZING analysis!

    Could you please define splitter and reverse-splitter? I know it has to do with having a much higher GPA or LSAT than the other, but don't know which is which. Thanks!

  2. Hey, thanks for reading! Yeah, I plan on making a page to define all this stuff, and explain how I coded it, so I appreciate you bringing this up. Splitters are the high-LSAT/low-GPA applicants, and reverse-splitters are the high-GPA/low-LSAT applicants. It's interesting to look at this, because some schools are traditionally considered "splitter friendly" or "reverse splitter friendly." I'm interested in seeing if the numbers bear this out!

    For my purposes, I have defined splitters as anyone with an LSAT score above the school's 75th percentile in the previous year, but a GPA below the school's 25th percentile in the previous year. I define reverse-splitters similarly, but with a GPA above the 75th percentile and an LSAT below the 25th. This is an admittedly arbitrary way to determine this, but it's at least consistent, and it seems to be fairly common. Loosening or tightening the restrictions might create different results, definitely. I should also add that my "splitter data" only goes back to the 2006/2007 application cycle, because I couldn't find the 25ths and 75ths for schools prior to that cycle.
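
    For anyone who wants that rule spelled out, here is a minimal Python sketch of the coding described above. The function name and the percentile values in the example are hypothetical placeholders, not Berkeley's actual 25th/75th percentiles, and I'm assuming "above" and "below" mean strictly above and strictly below.

```python
def classify_applicant(lsat, gpa, lsat_25, lsat_75, gpa_25, gpa_75):
    """Label an applicant using a school's previous-year 25th/75th percentiles."""
    if lsat > lsat_75 and gpa < gpa_25:
        return "splitter"            # high LSAT, low GPA
    if gpa > gpa_75 and lsat < lsat_25:
        return "reverse-splitter"    # high GPA, low LSAT
    return "non-splitter"


# Hypothetical percentiles for illustration only (not real school data).
school = {"lsat_25": 163, "lsat_75": 169, "gpa_25": 3.6, "gpa_75": 3.9}

print(classify_applicant(172, 3.4, **school))
print(classify_applicant(160, 3.95, **school))
print(classify_applicant(167, 3.8, **school))
```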
