Let's Read A Scientific Paper: Is Social Media Bad For You?
Specifically, this one:
Melissa G. Hunt, Rachel Marx, Courtney Lipson, and Jordyn Young (2018). No More FOMO: Limiting Social Media Decreases Loneliness and Depression. Journal of Social and Clinical Psychology. free preprint on Researchgate.
For quite a while I've had a recurring desire to start a blog that was just me reading whatever scientific study was hitting the mainstream media that week, summarizing and analyzing the paper from a layperson's perspective, and then pointing out what the media coverage should have been saying instead. I tried doing this on a tumblr sideblog for a while; what I always forget is that it takes a long time to do. Even if I pick a simple paper that's available open access or as a preprint and isn't super long or technical, don't look up any of the citations, and just accept any math I can't figure out, it still takes more free time than I seem to consistently have these days (I say, as I look down my long trail of excessively long journal entries from this week...)
But this paper - about how social media is very bad for your psychological health - hit the media a few weeks ago, and the coverage made me so mad I either had to write this up or stew over it all night silently instead, and it seemed like a topic y'all would be interested in, so here we go, let's do this!
Abstract:
Introduction:
Given the breadth of correlational research linking social media use to worse well-being, we undertook an experimental study to investigate the potential causal role that social media plays in this relationship.
Method:
After a week of baseline monitoring, 143 undergraduates at the University of Pennsylvania were randomly assigned to either limit Facebook, Instagram and Snapchat use to 10 minutes, per platform, per day, or to use social media as usual for three weeks.
Results:
The limited use group showed significant reductions in loneliness and depression over three weeks compared to the control group. Both groups showed significant decreases in anxiety and fear of missing out over baseline, suggesting a benefit of increased self-monitoring.
Discussion:
Our findings strongly suggest that limiting social media use to approximately 30 minutes per day may lead to significant improvement in well-being
Okay, let's go down the actual paper point-by-point, looking at what they're actually doing here. You may want to pull up the paper I have linked above and read along, it's pretty readable as these go, but you should be able to follow along without that.
- The experiment was done on 143 disproportionately female undergraduates who were taking psychology classes at an urban Ivy League university, who were iPhone owners and active users of Facebook, Snapchat, and Instagram, and who chose this social media study out of a list of other possible psychology studies to participate in to earn course credit. It was not a study of “humans in general”. This is such a problem in Psychology studies that it’s become almost a joke, but it doesn’t stop researchers from continuing to do it, and most of the other studies they’re citing did the same thing. I can think of half a dozen ways offhand why a study of psychological health and social media use of self-selected current undergraduates at an Ivy League university might not generalize well to other human beings; I bet you can think of even more.
- They didn’t test “social media use”, they tested use of the Facebook, Instagram, and Snapchat iPhone apps. This is a very specific subset of social media that, in some cases, we know for a fact has profit motives that have led these companies to specifically (if not intentionally) design their technology in ways that make people feel worse. It’s also specifically use of the mobile apps, which would generally mean use when someone is away from their computer. I can think of half a dozen ways offhand why a study of mobile use of Snapchat, Instagram, and Facebook might not generalize to all social media use; I bet you can think of even more.
- They explicitly point out (good for them!) that nearly all previous studies have been correlational – that is, “less happy people use facebook more” – which doesn’t make it clear whether the unhappiness causes the facebook use or vice versa. They list two previous studies that do try to prove causation, but have issues with both of those studies. I’m not going to dig in enough to read those studies too, but based just on what’s in this paper, I have issues with the FOMO authors’ analyses of them.
- Verduyn et al., 2015 apparently studied passive vs. active engagement on Facebook (reading vs. posting) and found that passive was worse, an observation that backs up my anecdata, but I’m not sure what that says about social media use in general. (I’d particularly like to see passive social media use compared to a) other passive media consumption and b) sitting in a room where other people are talking and you aren’t allowed to join in.) The FOMO authors’ objection, however, is that “the longer one spends on social media, the more one will be engaging with it in a passive way.” Have they. Have they never lost an entire night to a comment thread before? Do they just… not understand what active use of social media is? Sure, I can lose an hour scrolling Tumblr or Wonkette comments, but I can lose a day to having a discussion or writing a post or taking photos. I'm not even going to think about the amount of time I've spent on this post and I haven't even posted it yet. (Maybe they don’t think “content creation” counts as social media use?)
- Tromholt, 2016 apparently randomly assigned people to completely abstain from Facebook for a week. I agree with most of what they say about this, but then I get to “many users have grown so attached to social media that a long-term intervention requiring complete abstention would be unrealistic.” Dude. People have done scientific studies requiring abstention from food, I think recruiting people to give up Facebook for a few months is doable. Especially given stats you quoted in this same paragraph that half of US adults don’t even use it regularly anyway. Either you’re saying that giving up Facebook would mess up people’s lives so badly that you can’t ethically do it – in which case you’ve already answered your research question about whether it’s a net good – or you’re saying that you don’t think you could get Ivy League undergraduates to do it for course credit, which is another thing entirely.
- They chose the "well-being constructs" they were testing based on ones that had been shown in other studies to correlate with social media use. On one hand, this is good - it means they're trying to double-check the results of previous studies. On the other hand, this is bad - the "well-being constructs" they're using are subjective self-reported questionnaires, and by choosing specific questionnaires that have already been shown to correlate well with social media use, but changing the rest of the experiment, they're basically just double-checking those specific questionnaires. Rather than testing actual well-being, by filtering for questionnaires they already know get the results they're looking for, they may just be testing for a certain style of questionnaire that tends to get certain answers from social media users. (Note: self-reported questionnaires in psychology are. not the most rigorously scientifically supported things, in general. They are sometimes the best we've got, but that's often not saying much.)
- They tested well-being using seven different factors, all of which were measured using standard questionnaires. I’ll leave most questions about the specific questionnaires to people who know their current psychology more than me, but an important thing to note is that self-reported questionnaires are very vulnerable to biased reporting – that is, the study participants think they know what result they should be getting, so without even realizing they’re doing it, they slightly round up their answers toward that result. This is why blinding is important – trying to make sure the participants don’t know what result they should be getting, in order to reduce the effect of this (this study is not even slightly blinded.)
- They tracked whether the students really did limit their usage by having students send them screenshots of their phone’s battery usage by app. This means any usage not via the app on their own phone wasn’t tracked – which, to their credit, they point out as a weakness of the study. They also had a lot of trouble getting the screenshots sent consistently every day, so halfway through the study they changed the rules to weekly, which isn’t great in terms of the study design, and also meant they lost days’ worth of data whenever anybody’s phone battery ran out and cleared the records.
- In the first half of the study, they had students do the depression inventory less often than the others, because “it was assumed that depression would not fluctuate much on a week-by-week basis.” Is that… is that a standard assumption in psychology? Because I remember being an undergrad; my perceived level of depression fluctuated wildly based on things like how many hours ago I’d last remembered to make some ramen – and the study authors changed their mind on this halfway through the study and started including the depression inventory a lot more often, so clearly they too realized they were wrong. It worries me that they had such a limited understanding of depression in their study population going into a study about depression that they changed their mind on something this basic halfway through the study.
- “We found that baseline depression, loneliness, anxiety, perceived social support, self-esteem, and well-being did not actually correlate with baseline social media use in the week following completing the questionnaires.” – that is, people who used social media more before the study started weren't actually less happy. Which is interesting, because it seems to contradict those previous studies which all claimed it did. The paper proposes that this is because the correlation is not with actual social media use, but with perceived social media use – that is, unhappy people think they use social media more, when in fact they don’t – and that self-reported vs. objectively tested usage doesn’t actually match up all that well. This seems to me like the actual most interesting result in the paper, especially given that their well-being data was all self-reported and might be subject to the exact same bias, but they don’t dwell on it much.
- “Baseline Fear of Missing Out, however, did predict more actual social media use.” Okay. You guys. Fear of Missing Out. This is such a buzzwordy thing. It was invented by ad execs in the late 90s to convince people to buy stuff, and has since become something that is basically used entirely for social media scaremongering. Which is to say, Fear of Missing Out has, in large part of late, been defined by “that thing that young people who use a lot of social media experience”, so of course it correlates with social media use. I’m not saying it’s not a valid psych concept, but I’d like to see a lot more about it outside of the context of social media use and/or kids these days before I stake very much on it, and see more of it in the context of other less-buzzwordy psychological factors.
- The students were divided into a control group and an experimental group. They don’t seem to provide the numbers for exactly how many students were in each group, but with two groups and a total of about 140, that means the number of students who limited their use was only around 70. Because of the changes halfway through the study, the number who limited their use and had their depression scores fully recorded would have been only around 30. This is the level where three or four outliers can drastically change the significance of your results, even if they are far less dramatic outliers than Spiders Georg (and may not even be obvious as outliers - if a couple of kids in the control group caught the flu and none did in the experimental group, for example, that's within the range of chance for those kinds of numbers, but might have a big effect on your results beyond what your statistics are going to capture.)
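To make that small-numbers worry concrete, here's a minimal simulation sketch. It's mine, not the authors': it assumes Python with numpy and scipy installed, and the group size, score scale, and flu-sized bump are all invented for illustration, not taken from the paper. Two groups of 30 get a modest "true" difference in score change, and then three people in one group or the other have a rough couple of weeks:

```python
# Illustration only: how much three "flu cases" can move a result when each
# group has ~30 people. Every number here is invented for the example.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 30  # roughly the per-group size the post estimates after the mid-study changes

control = rng.normal(loc=0.0, scale=5.0, size=n)       # change in depression score
experimental = rng.normal(loc=3.0, scale=5.0, size=n)  # modest true improvement

def p_value(a, b):
    return stats.ttest_ind(a, b).pvalue

print("as drawn:                      p =", round(p_value(control, experimental), 3))

# Three kids in the experimental group catch the flu and their scores tank:
exp_flu = experimental.copy()
exp_flu[:3] -= 15.0
print("flu in the experimental group: p =", round(p_value(control, exp_flu), 3))

# Or the flu hits three kids in the control group instead:
ctrl_flu = control.copy()
ctrl_flu[:3] -= 15.0
print("flu in the control group:      p =", round(p_value(ctrl_flu, experimental), 3))
```

Depending on the random draw and on which group the flu happens to hit, the exact same underlying effect bounces between looking solid and looking like nothing, which is the problem with hanging very much on a single p-value from groups this small.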
- They found quite high significance for “loneliness”, which convinced me to look at the actual test they used for loneliness, which was the revised UCLA Loneliness Scale; Russell, Peplau, & Cutrona, 1980. What struck me, when I looked up the questionnaire, is that it asks things like “I have a lot in common with the people around me” or “there are people I can talk to”. Those are the kinds of questions that are not designed to acknowledge the existence of long-distance text-based friendships, and given that these are college undergrads away from home, they are likely to have a lot of long-distance friendships, and higher social media use probably corresponds to putting more effort into those long-distance friendships. (There’s discussion to be had about the quality of internet vs. in-person friendship, but this questionnaire, which was designed when email was barely a thing, is not the way to have it.) A difference between interacting more with local vs. long distance friends leading to differences in answers in the questionnaire about people “around me” could explain the entire difference in the loneliness scale. It wouldn't even require the students to feel like their online friendships were more lonely - because the questionnaire automatically assumes that fewer in-person interactions means more loneliness.
- Mind you, they didn’t make it easy to figure out what the actual difference in loneliness score was. They give the p-values and state that it’s “significant” but “significant” in statistics doesn’t mean “large”, it just means “probably not an accident”. It looks from the poorly-labeled graph like there was an (average?) drop of about 6 points out of 80 possible, which is the equivalent of answering about one and a half of the questions differently – certainly within the range of what might have happened if even a couple of the questions were biased against online interactions. The only other actual data we get for loneliness is “F (1,111) = 6.896” without any context, and I will confess to not knowing what that means without any further explanation (the arithmetic sketch down in the p-value section at least converts it into a p-value.)
- For depression, they split the results into two groups – “high baseline depression” and “low baseline depression”. They don’t at any point say how many people were in each of these groups. But remember we’re starting with about 70, so if it split evenly, it would be at most 35 in each; it’s unlikely it was split evenly, so now we’re looking at even smaller numbers in each group. Also, remember they changed the way they measured depression halfway through the study, so for the second half it’s at most around 15 in each group. At that point, just one kid gets the flu (physical illness tends to increase depression scores) and your numbers are off. And at no point do they discuss how they factored the change in methodology into their statistical analysis.
- There is a graph for depression. I hate this graph. It appears to directly contradict what is said in the text. If it doesn't directly contradict the text, it is an incredibly bad graph that manages to give the impression it does (well, it's an incredibly bad graph either way.) So I have no idea whether to go by what the text says or what the graph says.
- Either way, they do both agree that even the control group had a significant decrease in their depression scores, and the experiment group improved more. That seems to imply that simply taking part in the study reduced depression levels. This makes sense to me – for my personal depression, feeling like I’m accomplishing something and providing a net good to humanity is one of the best ways to improve my mood a little, as was (when I was an undergrad) successfully finishing homework, so getting a little hit of “I did the study!” every day for a month would probably reduce depression scores (not cure it, just reduce the scores a tiny bit), and in the experimental group, knowing that I had succeeded at meeting the study conditions of limiting my use would have helped even more. And given the other result that perceived social media use is more closely correlated to well-being than actual social media use, it could be that just being mindful of the time spent on social media improved mood, regardless of actual changes in use, because people weren’t overestimating time spent and beating themselves up for it. It would be interesting to see if there was a difference here between the people who reported every day and those who reported every week, or between people who were always successful in meeting the conditions and those who weren’t – but of course they don’t break that down at all. Probably because at that point the number of people in each group would be so small it would be impossible to treat it like good data.
- They also saw a decline in FOMO and anxiety over the course of the study, but with no significant difference between control and experimental groups, which further backs up the possibility that just participating in the study increased well-being, or the possibility that the results are due to bias in self-reporting due to the study not being blinded (or due to taking the same tests repeatedly changing the way the students responded to the tests), or reversion to the mean (where people are more likely to sign up for a study like this if they’re doing worse than normal, and then get a little closer to normal over the course of the study, because most things do eventually get a little closer to normal.)
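Since reversion (regression) to the mean is doing a fair amount of work in that last point, here's a tiny sketch of what it looks like - again mine, not from the paper, in Python with numpy, and every number is invented. If people are more likely to sign up when they happen to be feeling worse than usual, their scores drift back toward usual at follow-up even with no intervention at all:

```python
# Regression to the mean, illustrated with invented numbers.
import numpy as np

rng = np.random.default_rng(1)
n_people = 10_000

usual_score = rng.normal(50, 10, n_people)       # each person's typical mood score
wobble = lambda: rng.normal(0, 8, n_people)      # week-to-week noise (higher = worse)

score_at_signup = usual_score + wobble()
score_at_followup = usual_score + wobble()       # nobody received any intervention

# Suppose people tend to join a well-being study when they're feeling worse than
# average: keep only the worst-feeling quarter at sign-up.
feeling_bad = score_at_signup > np.percentile(score_at_signup, 75)

improvement = score_at_signup[feeling_bad] - score_at_followup[feeling_bad]
print(f"average 'improvement' with zero intervention: {improvement.mean():.1f} points")
```

The selection doesn't have to be that stark, either - any tendency for "I've been feeling lousy lately" to push people into signing up produces some of this effect, in the control group just as much as the experimental one.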
- You’ll note that the paper’s title mentions FOMO but FOMO was not one of the things that showed a significant result for experimental vs. control. See what I meant about it being a buzzword?
- The other three tests showed no significant difference either between experimental and control groups or between baseline and post-experiment. It seems to me that "social support, self-esteem, and well-being are completely unaffected by social media use" is almost a more interesting result, but for some reason, that part didn't show up in their abstract or press release. Wonder why.
- Is it time to talk about p-values? I think it is. Almost all of their results are reported as p-values. If you think of experiments as attempts to figure out whether something interesting is happening, the p-value is basically the score most scientists use to decide whether a result looks interesting or could just be coincidence.
- P-values are relied on for this probably a lot more than they should be, for lots of complicated reasons - the word "Bayesian" gets thrown around a lot - but statistics are not my favorite thing, so let's just go with p-values being the standard for now. So, to explain p-values:
- Flip a coin once, it comes up heads. This is not very interesting. Flip it twice, it’s heads both times, it’s a little bit interesting, but not very, that could just be coincidence. Flip it ten times and it’s heads every time, then either there’s something up with the coin, something up with the way you’re flipping it, or you’re a character in some kind of postmodern theater production and may actually be dead, and regardless, you should probably try to figure out what’s going on.
- The lower the p-value, the more heads in a row you’ve flipped. A p-value of .05 means you’ve flipped somewhere around four or five heads in a row – there’s about a one in twenty, or 5%, or .05, chance of that happening purely by coincidence. .05 is what people usually use as the cut-off for interesting (aka significant). A p-value of .001 means you’ve flipped about 10 heads in a row, and you’re in Tom Stoppard territory - something is definitely up.
- With a coin flip, we have a pretty good idea of what the results would be if nothing interesting was happening (we would get 50% heads if we tried enough times, and get closer to 50% the more often we tried.) With most things in actual science, we don't really know what the probability of something happening by chance really is - some of the numbers plugged into the statistics are always going to be best guesses (otherwise, we wouldn't need to do the science.) For something like a particle accelerator, most of those best guesses are going to be really good best guesses, based on a lot of previous science, and engineering tests, and very hard numbers. For something involving people, not so much.
- For loneliness, where there was a 'significant' difference between control and experiment - their p-value was .01, which is somewhere around seven coin-flips coming up heads in a row. For FOMO and anxiety – where the change was around the same in the control group and the experimental group – they get p-values around 9-10 heads in a row. These are pretty interesting results - something is probably happening, even if we disagree about what it actually is.
- For depression, they only give the p-value as “<.05” and not the actual value, which seems kind of sketchy, and usually means it was somewhere around .04999. That’s back to about four or five coin flips in a row, or a one-in-twenty chance of it being a coincidence.
- Remember, they tested seven different things, and at least some of those things they cut up into different categories, so they were basically doing more than one test on them. We know they tested change for high depression groups and low depression groups, in both the experimental and control groups, and tested both the experimental and control groups on the six other measures, so that’s up to 16 tests, and we don’t know if they tried to divide up any of the other measures and didn’t see anything interesting so they didn’t report it. But already we’re up to the 1-in-20 chance coming up in one test out of 16 – suddenly, a 1-in-20 chance doesn’t sound that interesting. (see this xkcd comic if I lost you on that one, and the arithmetic sketch just below if you want the numbers.)
- This is why you need to share more data than just p-values. And why you need to replicate by having someone else perform the tests and see if they get the same results - the same 1-in-20 chance coming up twice in a row is suddenly a lot harder to write off. (This kind of psychology study is notorious for not getting the same results twice in a row.)
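If you want the actual arithmetic behind the last few bullets, here's a quick back-of-the-envelope sketch in Python (scipy is only needed for the F-statistic line). The log2 trick is just my coin-flip analogy made literal, and the "roughly 16 tests" figure is my count from what the paper reports, not the authors' number:

```python
# Back-of-the-envelope versions of the p-value points above.
import math
from scipy import stats

# 1. "How many heads in a row is this p-value?"  p = (1/2)^k, so k = log2(1/p).
for p in (0.05, 0.01, 0.001):
    print(f"p = {p}: about {math.log2(1 / p):.1f} heads in a row")

# 2. The loneliness result is reported as F(1, 111) = 6.896.  On an
#    F-distribution with (1, 111) degrees of freedom, that works out to:
print(f"p-value implied by F(1, 111) = 6.896: {stats.f.sf(6.896, 1, 111):.3f}")

# 3. Run ~16 tests that each have a 5% chance of a fluke "significant" result,
#    and the chance of getting at least one fluke is:
n_tests = 16
print(f"chance of at least one fluke in {n_tests} tests: {1 - 0.95 ** n_tests:.0%}")
```

Which is the point of that last bullet: one p-value just under .05, pulled out of a pile of tests, is a lot closer to "the coin came up heads a few times" than to "discovery."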
- They wanted to get final follow-up data from the participants several months later. However, only about 20% of them responded, because they’d already earned their course credit and the semester was over (one of many reasons why using undergrads is not always the best choice.) Also, looking at the numbers elsewhere in the paper, it looks like for most of the tests they were only using data from about 110-120 of the 143 undergraduates who signed up. Why? What happened to the other 20-30? Did they drop out of the study, or was their data unusable for some reason, or did they reject it as outliers? Dunno, it’s not in the paper.
- ALL THAT SAID - and it was a lot - this is not a particularly bad paper, as papers in this field go. It's probably significantly better than most - it's better than the last few I've read. The fact that I could shoot it full of holes is not so much on the scientists involved as on the field as a whole - on the pressure to publish something, on the fact that papers like this do get published regularly, on the fact that most of the things they did that I objected to are things that most psychology papers do.
- Also, the free version of this paper is a pre-print version, which means the full version that's published in the journal might be slightly improved (not very much improved, though, probably, because this version was on the journal website for a while.) Some preprints are actual rough drafts, and I wouldn't pick them apart this hard - but this one had a press release to go with it, and if you're putting out a press release, you're saying your paper is ready for prime time.
- The title and all the press coverage - including in the science blogosphere, which used to know better but has been getting worse and worse - vastly overstated the results and the extent to which the results could be generalized to all people and all social media. They always do this. It's terrible. It builds mistrust and misunderstanding of science, reinforces bad practices in the lab, and creates a space for actual fake science to come in. It needs to stop. I don't know how to stop it.
- Actual interesting results from this paper that I would have put in an article about it if I was a science journalist:
- People who are unhappy probably think they are wasting time on social media more than they actually are. (Or, perhaps, happy people think they're using it less.)
- It's possible that the reason studies like this get positive results is that press coverage of studies like this convinces people that if they're unhappy they must be using more social media which means that when they participate in studies like this they report that they use more social media if they're unhappy which means the studies get press coverage and the cycle repeats, because science doesn't happen in an ivory tower.
- Tracking your social media use - or possibly just any sort of small, achievable daily task that feels productive, such as tracking your social media use for a study - seems to make people less unhappy.
- Reducing Facebook and Instagram and Snapchat use on mobile devices among mostly-female ivy league college undergraduates appears to improve their mental health (which, given that Facebook at least was invented in order to make ivy league college girls feel like shit about themselves, is not a surprising result regardless of your opinion of social media in general!) Given the mental health problems being seen among Ivy League college undergrads, this is probably a very useful thing to explore further even if it's kept to that limited scope.
...so that is what I do whenever I see a "new scientific study" being reported in the media. It is probably something you could learn to do as well! Even if you don't want to make a hobby of it as I have, it's useful to remember that pretty much any scientific study that is supposed to be giving "amazing new results" is really just somebody saying "I tried a thing and this result looks kind of interesting and maybe worth following up but I dunno really," only with stats, and with grant money on the line.
I leave this study apparently claiming that women think with their wombs for someone else to analyze. :P
ETA: And this study does a fairly bad job at even showing that - anxiety and self-esteem and social support and general well-being weren't apparently affected, using their metrics. Of course, hard to say if "facebook" or "an ivy league university" is, on balance, more toxic toward young women...
WTF?!
Psychology is supposed to be one of the sciences and this wouldn't pass muster (or shouldn't, I don't know what standards 'are' currently) over where social sciences are committed.
I'm glad you read this for us and presented it; I've got people who 'may know more' or will at least have Caustic Eyebrows about it.
Psychology's currently at the center of an ongoing debate in some scientific circles about just how crappy a lot of published research is - it has the snappy name of the Replication Crisis. Psychology's not the only field with the problem, but it's got a lot of the most blatant examples. (Unfortunately, this does not seem to have done anything to stop the press from going nuts on any study that appears to confirm existing biases. :/)
I have to consciously not skew my answers in a study. Even in a blind one, I can often figure out what they are trying to do just by the questions they are asking.
What they have told me is that if you have a bunch of people all of the same age, similar life spaces, and similar social class in a compressed geographical area and encourage some of them to limit their screen time, they feel better. Probably because they interact more, since they have a high level of opportunity for direct interactions and lots of targets of opportunity for interaction, and so they are in a better mood. (If you are doing depression testing that often, you aren't getting actual depression levels, you are getting mood.) There are many studies on increased direct social interactions improving mental health.
I think a control group reporting they ate green vegetables once a day or drank water instead of a soft drink might have shown the same positive mental health effect, and the study may not say anything about social media as such at all.
This is worrying.
That said, the results you got from it are definitely interesting. And some of it lines up with my personal experience: when depressed I too feel accomplished and thus better when I manage to regularly track something. ...I should start doing that again.
given that Facebook at least was invented in order to make ivy league college girls feel like shit about themselves
Huh, this is the first I've heard of this! Do you have additional context?
(It's a terrible book, I don't recommend it, I DNFed it.)
*Based on your analysis. I don't have the spoons for reading it straight right now.
The paper is probably shorter than my analysis though.
...I will volunteer to turn your diss chapters into limericks though.
I wanna complain about this stuff forever, especially since I was in that position. "You have to participate as a requirement of this course, here are things to do", and then you show up, you half-ass it, and then you leave. Or they recruit college students saying we'll give you 20 bucks, and then you half-ass it and you leave with some money to cover the food budget for the next couple days.
So, yeah, great population for your research, even on top of the problems of relying entirely on college students to generalize for humanity.
Every so often I stumble on a psych study that actually makes an effort to recruit from different populations and then compare the results, and it's hard to even read the rest of the paper through the sparkles of joy.
And for a lot of these studies, the problems are baked into the study design, so there wouldn't be a lot you could do at first other than reject a bunch of papers (or incentivize people to do fraud instead.)
I would totally do it if someone would pay me for it, though, yes!
This is an excellent analysis of the honestly many, many problems with that study. The one that I am still stuck on (I read this post on the way to school this morning; it is now nearly bedtime) is that they changed their methodology halfway through. I just. *hands* You can't do that!!! You can't do that and not somehow address that change in the data analysis I - *incoherent yelling*
Anyway, I love the idea that you had to dissect a study like this regularly, and I'd definitely be a delighted reader of any subsequent post like this you'd like to make, but I also very much understand that being rather a work-intensive undertaking.
Anyway, *points at icon in complete and utter lack of patience at the entire field of research psychology at this point*
And the fact that the methodology change happened to affect the most interesting result really stinks of "quick, this isn't quite coming up significant, we need to tweak something so it does."
YOUR SAMPLE SIZE IS SMALL
I'm glad I got out of the field when I did. I support the efforts to replicate classic studies, but from what I've heard (in media reports, because I've lost touch with everyone who's still working in the area) a lot of it has been in a spirit of drama, wank, and back-biting, rather than a spirit of inquiry.
Because here's the thing: high-impact social psychology studies have always required certain skills in the researchers. You're trying to capture a complex human phenomenon and model it in a brief, safe time and space. That modeling is possible *if* you can get the participants to buy into it, and I don't know if anyone ever succeeded in quantifying or describing the contextual qualities that create that buy-in. Ugh, it's getting late and I'm not sure I know where I'm going with this anyway... I think maybe my point is that a failure to replicate doesn't necessarily mean that the original study was just a random fluke - it can just as likely mean that there were other social forces at work in the original study that were not specified in a way that makes them easily replicable... but something still happened that is worth understanding and pursuing, albeit maybe in a different direction.
It's kind of part of why I have that wistful sense of "like can you just like . . .give me 20$? and I'll run through your setup beforehand and point out all the places where even I as an autodidact with a broad knowledge range can tell you that this study is not actually doing what you think it's doing but is in fact doing something else due to this that and the other bias/perspective/lack of control/thing you appear not to have noticed?"
Sometimes they might even decide that what they were ACTUALLY going to study is useful! It just wasn't what they THOUGHT they were studying! Like that time a bunch of neuropsych people decided to scan the brain of some famous climber because they were in awe of his ability to be totally cool and collected in very dangerous climbing situations . . . . except they decided this meant that he must naturally have some physical abnormality in his brain that meant he Didn't Experience Fear?
. . . . .except that in interviews with him he talks, explicitly, about being unable to watch (say) Game of Thrones because it's upsetting and agitating, and he's CLEARLY anxious in social-dialogue situations, so actually you're not studying a guy who has NO fear response, you're studying a guy who has fear/arousal responses that do not match up with what a LOT of people experience and that in and of itself is fascinating, interesting and valuable! It would be a good thing to study!
It's just, you're not studying a guy "without fear". That's not what's going on here.
Also hooooo boy are you right about the actual attempts to replicate in many high profile cases. (The not-actually-like-that ones don't attract as much attention.)
Also -- I too, have contemplated a side blog on reading scientific papers. Although I'm not sure that I could get to this level of rant.
Also, also - that F-test statistic that is reported? I initially read it as bollocks, because I thought the 1,111 was a single number; as it is, I'm assuming that the 1 and the 111 are the degrees of freedom. But really, the psych obsession with putting in their t- and F-values into papers instead of useful things like the mean and 95% confidence intervals on their estimated differences irks me.
While I'm thinking of post-scripts: I have no idea who linked to this discussion, but I'm assuming I followed a link from somewhere.
I think the best way to test this is to make a cool mood-testing app that people want to use, then give it permission to monitor usage of other apps. Have the person say in their own words which media sites they use most often, and use that to monitor social media, because in my circle of knowledge only old folks do most of their surfing on facebook. It should monitor pinterest, tumblr, twitter, discord, etc., and it should be distributed to non-students. Just place it in the app store with all the permission forms preloaded after download.
Then again, I honestly don’t care what the study says. I find that the important things are limiting access to toxic sites and not missing too much sleep - that's what has the most effect on mood. Not the old “In my day we had real friendships” kind of moralizing that papers like to report.