CPFAQ: Quora Topic Cleanup


I’m working on a project this year to build a competitive programming FAQ. This is one in a series of articles describing the research, writing, and tool creation process. To read the whole series, see my CPFAQ category page.

As part of the FAQ research process, I’m creating a canonical list of Quora topics related to competitive programming. Like many things on Quora, topics in this area are a bit messy. Or as the Quora Topic Gnomes say:

Quora’s Topics are a free-for-all, and that often creates greatness, but it really requires crowdsourced curation to make it all it can be.

Topic Gnomes

Anyone can create a Quora topic, and many people take the opportunity to do so. This helps keep questions categorized, but it can also lead to duplicate topics (Competitive Coding), misspelled topics (Competitive Progarmming), and topics that are just too narrow to be useful (JANE (SPOJ), a topic about one particular SPOJ problem).

Fortunately, Quora offers a topic merge function to combine related topics. Unfortunately, many popular topics are locked to prevent vandalism, and most users can’t merge a topic into a locked topic. For those topics, the official process is to create a post on the Topic Gnomery blog with the suggested merge. So I’m collecting data for such a post.

Topic Criteria

To create my canonical list, I need to come up with rules that determine which topics go on the list, which stay off the list, and which need to be merged.

For a topic to go on the list, it needs to be sufficiently related to competitive programming. The top-level topic for the list is Competitive Programming itself. Clearly, topics like TopCoder, ACM International Collegiate Programming Contest (ICPC), and C++ in Competitive Programming also belong on the list, since they are closely related. However, a topic like Algorithms in C++ is too general. Although competitive programmers need to know about algorithms, and many of them submit solutions in C++, adding topics like that one to the list would greatly increase its size, and would turn it into a general programming list. There are already plenty of those lists.

One potential grey area is how to treat topics related to coding interviews. Coding interviews and competitive programming aren’t exactly the same thing, but they are so closely related that some coding interview topics belong on the list. For example, HackerRank is a “technical recruiting platform,” but it does its recruiting through programming puzzles and contests. So it needs a competitive programming platform to run its recruiting, just as TopCoder needs one to run its contests. As a result, some questions in the HackerRank topic are relevant to a competitive programming FAQ.

Merging Criteria

The Topic Gnomes will want to hear reasons why particular topics should be merged. When deciding whether to merge two topics, three considerations are topic overlap, topic activity level, and topic philosophy:

  • Topic overlap is how many questions from one topic could also be tagged with the other topic, and vice versa. If many questions could share both topics, maybe there’s no need to have both.
  • Topic activity level is how many questions are tagged with a topic. If a topic doesn’t get much activity, maybe it’s not worth keeping it around. On the other hand, if a topic is frequently used, people may not want it to be merged, even if the merge would otherwise make sense.
  • Topic philosophy is how you think topics should be used. For example, do you prefer many specific topics with fewer questions per topic, or a small set of more general topics? Since Quora users can create topics at will, and it takes effort to merge topics, I think Quora ends up with too many topics. Hence the need to clean up topics in the competitive programming area before I make my topic list.
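
The first two criteria can be turned into rough numbers. What follows is only a sketch, not a tool from this project: it assumes you have already collected the set of question identifiers tagged with each topic, and it uses questions that currently carry both tags as a stand-in for questions that could carry both. The function names, the sample identifiers, and the two thresholds are all hypothetical.

```python
def overlap_ratio(questions_a, questions_b):
    """Fraction of the smaller topic's questions that are also in the other topic."""
    smaller = min(len(questions_a), len(questions_b))
    if smaller == 0:
        return 0.0
    return len(questions_a & questions_b) / smaller


def suggest_merge(questions_a, questions_b, min_overlap=0.8, max_activity=50):
    """Flag a merge candidate: high overlap, and the smaller topic is lightly used.

    min_overlap and max_activity are arbitrary illustrative thresholds.
    """
    smaller = min(len(questions_a), len(questions_b))
    return (
        overlap_ratio(questions_a, questions_b) >= min_overlap
        and smaller <= max_activity
    )


# Made-up question identifiers for two topics.
competitive_programming = {"q1", "q2", "q3", "q4", "q5"}
competitive_coding = {"q2", "q3", "q9"}

ratio = overlap_ratio(competitive_programming, competitive_coding)
candidate = suggest_merge(competitive_programming, competitive_coding, min_overlap=0.6)
```

In this example two of the three Competitive Coding questions are shared, so the overlap ratio is 2/3 and the pair is flagged at the looser 0.6 threshold but not at the default 0.8. Topic philosophy is the part that stays a judgment call: it decides where the thresholds should sit.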

(Image credit: John)