There were far too many amputees, an unbelievably high number, in fact. And a veritable horde of deaf teens who claimed to have never visited a dentist. And to be gay. And to be gang members.
Something was amiss within the National Longitudinal Study of Adolescent to Adult Health, one of the largest research projects on adolescents in history. The survey, which followed teens over the course of several years, comprised a cache of statistics that investigators across the social and behavioral sciences could mine for insights into American youth. When the data dropped, researchers swarmed—hungry for randomized population figures they could cram into studies about how adopted children have behavioral issues and LGBT youth are more likely to suffer from depression. Few stepped back to wonder why there were so many gay deaf teens with bad teeth and criminal contacts.
Joseph Cimpian did. An economist at New York University and perhaps the sole expert on the phenomenon of reckless teens ruining scientific studies, Cimpian came at the body of data with Occam’s Razor and started slicing. “So they’re extremely tall, blind and deaf, have several children, and have never been to a dentist. And all are miraculously LGBTQ? I’d say that has a low probability of being true,” Cimpian told Fatherly. “It kind of surprises me that people didn’t immediately think, of course some teens are going to lie. Some kids are going to think it’s funny to give extreme, untruthful responses. Remember when you were a teen?”
Survey data is sacred to social scientists, who cannot begin to understand what children and teens experience without asking questions and cannot hope to get honest answers without offering anonymity. Many a psychologist made a career out of plumbing Add Health for information about what makes kids tick. Casting doubt on the data collection method is tantamount to questioning the foundation of thousands of scientific studies and, in a broader sense, the credibility of a field of research with profound impacts on public policy. When scientists like Cimpian started poking holes in the Add Health data, they were also picking a fight. And that fight, over whether or not kids can be trusted, has far-reaching ramifications.
In fact, teen trolls have already taken down important research. One 2000 study that found behavioral problems among adopted teens recanted its results when it became obvious that “the self-identified ‘adoptees’ in the Add Health project sample used in that work were not actually adopted,” according to a 2003 erratum. “Those respondents’ extreme responses exaggerated the apparent differences between adopted and nonadopted adolescents.” Even more unsettling is that no fewer than three studies drew broad conclusions about the struggles of amputee teens, at least partly based on answers from the 253 adolescents in Add Health who said they used artificial limbs. But later in-person interviews revealed that just two of the teens had been telling the truth. A full 99 percent of the alleged sample never existed.
But the most serious debate sparked by the questionable Add Health data concerns LGBTQ youth. A handful of studies used Add Health data to demonstrate that LGBTQ teens are at higher risk of suicide and depression, a finding that guarantees regular funding and social services for this at-risk population. Meanwhile, other studies have called those results into question, claiming that straight teens likely pretended to be gay, depressed, and suicidal on the survey just to mess with researchers. Several studies, letters, and essays later, the conflict has only grown in its intensity.
“It did kind of become a civil war,” Cimpian says. “People really dug their heels in.”
By all accounts, Ritch Savin-Williams fired the first shot. A developmental psychologist at Cornell University, Savin-Williams was among the first to publicly call the LGBTQ Add Health data into question with an essay published in 2014. “A large percentage of kids who said they were gay or bisexual later turned out to be straight,” Savin-Williams says. “For me, that’s a huge red flag. I didn’t think that would be so controversial.” Besides, even before Savin-Williams sounded the alarm, it was known that the Add Health figures were not exactly paradigms of clean data. “There are some very strange things that anyone who has ever worked on the Add Health data set has noticed,” he says. “Any time you give an anonymous survey to kids there are going to be a few who want to play ‘screw the researcher’. I don’t have limbs, I have six children at 16. Most datasets delete these data. For some reason, Add Health didn’t do that. They left them in there.”
In his essay, Savin-Williams noted that 70 percent of adolescents who identified as gay in the first wave of the study (when they were in grades 7 through 12) identified as straight by the fourth wave of the study (when they were in their mid-20s). He concluded that it was unlikely that 70 percent of gay teens crawled back into the closet as young adults, and that it was logical to assume most of them were mischievous responders who had lied about being gay.
But it’s likely Savin-Williams’ conclusion would not have caused much fanfare had he not editorialized a bit, suggesting that LGBTQ youth may not actually be at higher risk than straight youth. “Maybe I never should have put it in there,” he says. “To suggest the possibility that gay youth weren’t so traumatized after all, if we delete all the kids who are jokesters, was a direct affront to some of the most important publications using Add Health data.”
Indeed, authors of those very publications rallied against Savin-Williams. Fiery comments rolled into the journal Archives of Sexual Behavior, where Savin-Williams had published his essay, and psychologists and social scientists derided his work as all but stealing essential services from LGBT youth. Most of the back-and-forth, however, was conjecture—nobody had solid figures on how many mischievous responders were in the data set, and nobody could truly demonstrate that honoring or discarding the data would have an effect on LGBT kids. So the civil war faded into the background.
Until a study published in May 2017 rekindled the debate. Stephen T. Russell, coauthor of one of the studies directly affected by Savin-Williams’ work, teamed up with University of Texas postdoc Jessica N. Fish to settle the question of whether trolls had ruined the Add Health data, once and for all. Using techniques partially developed by Cimpian to weed out mischievous responders, Russell and Fish concluded that barely two percent of LGBT respondents were likely jokesters. (“They don’t do exactly what I would have done if I were checking things out,” Cimpian says of the study. “But it’s good they do some of it.”) Russell and Fish then demonstrated that, even when these trolls were removed from the data set, LGBT youth were still at higher risk for depression, suicidal ideation, alcohol use, and cocaine use.
This was a key victory. Savin-Williams’ work had “really undermined the integrity of a data set that had built a foundation for our understanding of LGBT health disparities,” Fish says. “Being able to go back and wade through the data and show that, even when we account for these types of responses, we still see those disparities, is really powerful.”
Based on the findings, Russell argues that mischievous responders are far less of a problem than Savin-Williams made them out to be. “It’s the essential thing that keeps us up at night—what’s the truth of our data?” he says. “As a culture we don’t really trust young people, so it’s not a stretch for us to jump to the conclusion that the youth are trying to punk us. But we consistently find between 1.5 and two percent of youth are jokesters, and that youth pretty much tell you the truth.” This study should be the final word on whether LGBT teens are at higher risk, Russell adds. “Mischievous responders are an issue,” he says. “But that should not be used to discredit the body of knowledge about LGBT kids.”
Savin-Williams, however, says the study falls short of addressing his main concerns with the Add Health data. Although some of his issues with the work are technical, his overarching complaint is that Russell and Fish merely demonstrated that LGBT kids are no more likely than straight kids to give mischievous responses, something Savin-Williams says he never had any reason to dispute. His 2014 essay claimed that it wasn’t LGBT kids making up answers—it was straight kids, pretending to be gay (and deaf, limbless amputees with gang affiliations). “We’re like two ships passing in the night,” he says. “Their findings have no relevance for our paper, and they failed to address what we were saying.”
But the debate runs deeper than spurious data, mischievous responders, and a few strongly worded essays. Politics and funding raise the stakes. Russell, Fish, and those who defend the Add Health data claim that the best way to allocate resources for LGBT youth is to demonstrate empirically that this population is in dire need. Anyone who suggests that the data is flawed, then, is indirectly suggesting that LGBT youth need fewer resources. For professionals who dedicate their careers to helping at-risk teens, this is quibbling over statistics at the cost of saving lives. “Let’s stop arguing about whether or not there is a risk,” Russell says. “And focus our attention on preventing negative outcomes for kids.”
This argument, however, is more practical than scientific. The notion that it’s time to set the numbers aside and roll up our sleeves tends to infuriate scholars for whom data is sacrosanct. “They come down on the side of ‘let’s stop worrying about data validity and focus on the disparities.’ I don’t see how those two things can be disentangled,” Cimpian says. “If you can’t trust the data, why would you trust that there’s a disparity? It should be about trying to uncover the truth.”
“Their conclusion kind of sweeps things under the rug,” he adds. “It’s very frustrating.”
For Savin-Williams, Cimpian, and others who question whether teen trolls have marred the Add Health data, the battle is likely not over. “I was a bit unhappy Stephen [Russell] went ahead and did this,” Savin-Williams says. “A big part of me went, ‘Oh jeez, do I have to go back to look at this again?’” But now, three years after the initial volley of shots was fired in both directions, at least some of the passion behind the debate has died down. While both Savin-Williams and Russell say they have no intention of giving ground, they both imply the debate is less personal than professional. “I like Stephen a great deal, and I think there’s evidence he likes me,” Savin-Williams says. “We’re not mortal enemies.”
Because at its core, the social science civil war is less about Add Health and LGBT youth than it is about how to best study adolescents, and how to best serve their needs while remaining faithful to the numbers. Social scientists are often placed in the odd position of both studying a community and advocating for it, which means they regularly find themselves pinned between activists and scientists. In one corner stands the activist, desperate to save lives and protect LGBT youth. In the other, the scientist hovers protectively over the evidence base. Russell, Fish, Cimpian, and Savin-Williams have all dedicated their careers to both helping at-risk kids and defending data, which means they inevitably occupy an awkward no-man’s land, somewhere in the middle. “He should use his way and I will use my way,” Savin-Williams says. “Let’s hope we end up with the same outcome: healthier gay youth. We both want that.”
But the more serious concern, that kids will be kids and that science will be the worse for it, is not going away. Cimpian’s work continues to help identify mischievous responders in existing youth datasets, and others have proposed in-person interviews to help weed out childhood trickery (at the cost, of course, of anonymity). And yet, thanks to teen science trolls, it’s still hard to know what data we can trust. Which means it’s hard to know how to do right by kids.
“It’s very troublesome,” Cimpian says. “It boggles the mind that we aren’t more concerned about this problem.”