I’m working on a project this year to build a competitive programming FAQ. This is one in a series of articles describing the research, writing, and tool creation process. To read the whole series, see my CPFAQ category page.
Every website that gets its content from its users has to deal with the problem of quality control. It’s not enough just to ask people to write words and publish everything you get. With that approach, low-quality garbage easily overwhelms useful writing. The most successful websites in this space have come up with unique ways to ensure that their content is worth reading.
The Quality Control Problem
Good writing isn’t easy. It takes time and effort to write something that’s interesting and understandable. Historically, publishers have played a gatekeeper role, filtering out most submissions. This arrangement has pros and cons. It gives readers a way to find quality writing. But it also filters out some good writers who don’t manage to get past the gatekeepers.
Traditional publishing developed in a world of scarcity. In the online world of unlimited publishing capacity, writers no longer have to go through publishing gatekeepers. However, readers still need a way to decide what to read. They can stick with the online incarnations of traditional publishers, with their professional writers and editors. But because there’s a lot more user-generated writing than professionally produced writing, most people eventually end up on sites like Wikipedia, Stack Overflow, and Quora. Here’s how each of those sites handles the quality control problem.
Wikipedia’s notability criteria require that each article include citations to reliable and independent sources. This helps ensure that the article is more like a research paper than a personal reflection. It also makes it more difficult for writers to use Wikipedia for self-promotion.
Wikipedia editors (i.e., anyone who likes to edit Wikipedia) enforce the notability criteria and countless other rules dealing with everything from writing style to legal issues around biographies of living people. Many articles have followers who get a notification whenever a change occurs. When a notification arrives, they check the diffs and make sure the new changes look reasonable. This oversight process powers Wikipedia’s uncanny ability to resist both vandalism and well-intentioned but misguided edits.
Of the three sites discussed in this post, Wikipedia has the strictest quality control rules. Articles on major topics approach the level of quality found in a professionally edited publication, even as the volume of changes surpasses what a professional staff could keep up with.
Last week, Stack Overflow co-founder Jeff Atwood wrote a retrospective article on the occasion of Stack Overflow’s tenth anniversary year. In it, he describes Stack Overflow as “a sort of Wikipedia website for computer programmers to post questions and answers.” He makes several other comparisons to Wikipedia:
- Stack Overflow releases its content under a Creative Commons license, which permits anyone to use it for a variety of purposes.
- Questions and answers should be written for anyone who might encounter a particular programming challenge, not just the person who originally posts the question.
- Multiple questions asking almost the same thing are not allowed.
On the topic of quality control, another Wikipedia comparison is relevant: active Stack Overflow users care about question and answer quality. They spend time not just writing questions and answers, but upvoting, downvoting, flagging, deleting, and debating to maintain quality standards. This gives them a reputation for being unfriendly rule-enforcers, but it has made Stack Overflow the best place to find answers to programming problems.
Stack Overflow doesn’t have Wikipedia’s rule about independent sources. People can ask and answer based on their own experience. And users don’t ruthlessly edit a single article over weeks and months. Once a question and answer serve their intended purpose of solving a well-defined programming problem, people generally move on to the next question. But the community does have strict rules about which questions are allowed, how many answers are too many, and other guidelines that have evolved over the years. This gives Stack Overflow and Stack Exchange the highest quality of any Q&A site or forum.
Unlike Wikipedia and Stack Overflow content, Quora questions and answers vary widely in quality. On one end are thousands of math answers by Alon Amit and similarly industrious contributors. On the other are the countless repetitive and mediocre answers, unmerged duplicate questions, pointless topics, and other digital garbage that would never survive on Stack Exchange.
The Quora corporate philosophy seems to be: collect as much content as possible, regardless of quality. Then use signals like votes and view counts to extract the good content and present it to users on their feeds. But leave the garbage content around for unlucky users to stumble across. This is in contrast to the Stack Exchange approach, in which low-quality questions are deleted so they don’t clutter up the site.
A recent initiative, the Quora Partner Program, doubles down on this approach. Quora pays QPP members based on the views that their questions get. This encourages some of them to post thousands of questions that they surely have no intention of following. And because the machine learning algorithms that generate Quora feeds are far from perfect, some people keep seeing these pointless questions until they mute the offending writers.
Quora is the lowest-quality site of the three profiled here, and this is partly by design. Quora believes that by generating enough volume, it can collect acceptable answers for the questions people are searching for, thereby generating traffic and ad revenue. So far, the plan seems to be working reasonably well. Quora.com ranks 95th globally, compared to 64th for Stack Overflow and 5th for Wikipedia.
Through some historical turn of events, Quora became the top site for competitive programming questions and answers. So that’s where I do most of my CPFAQ research. My process is to collect questions using my own tools, rather than relying on the Quora feed. Using this raw view of Quora content, I see the top questions and answers and also the long tail of less visible content.
I don’t think Quora’s approach is the best way to generate a useful body of writing on a topic. But it works. And because many CP enthusiasts spend their non-programming time there, it has collected a lot of useful information on the topic.
(Image credit: Ruth Hartnup)