The Anatomy of Online Technical Communities

How tech communities are born and killed online

I'd like to discuss a topic I have spent many years researching and that really fascinates me: the makeup of online communities, and specifically online technical communities. This is perhaps the least technical topic I've written about on this blog, but one I believe is very important. The Internet has grown to over 3 billion users. Today 1 billion web sites are all competing to grab our attention. We're sending out hundreds of billions of emails every day. Interaction online has never been more vibrant.

What I've spent a lot of time trying to understand over the years is what makes up these communities, and why some of them grow to be so large while others quickly dissipate or wither away. It turns out there are some really interesting insights into this if you dig deep enough.

What's really funny to me is that some people think an online community exists specifically to serve their own interests, without regard to what the community states it has set out to do or what rules or guidelines it applies to its members at large. Yet many people participating in online communities today don't see this as unusual behavior. They believe it's perfectly normal to participate in ways that can only serve their own self-interest.

Perhaps it is because they feel that, since they have access to the Internet and the community is itself accessible from the Internet, this gives them free rein and privilege over the community. Or perhaps they are oblivious to their effect on other members of the community. Or perhaps this is just a byproduct of the Internet working so well to bring people together that they can't even tell the difference between their own interests and those of the community at large. Whatever the case may be, there's no doubt that the Internet has caused a lot of ruckus in how we form and shape our technical communities today.

The Online Community Stack

I've found that just about every online community formed in the last 20 years or so seems to share five common traits. I call this the online community stack because, well, I'm a geek and I'm focusing on technical communities. However, these same five traits seem to be evident not only in technical communities online, but in just about any community that has an online presence.

What's even more interesting is that the data supports the notion that the traits at the top of this stack always seem to show the highest degree of variance among members of the community, while the lowest levels of the stack generally exhibit a low level of variance.

For example, every community has a place where its members will gather, whether that's a mailing list, a forum, a chat room, or some web site. Yet the number of people who agree on exactly where a community should gather online seems to vary widely. This is especially evident in the earliest stages of forming the community. There's rarely been an open source project started in the 90s that didn't begin by gathering on a mailing list. Despite the changes in technology over the years, these projects still maintain some form of mailing list or online forum in conjunction with newer places to gather like GitHub, Twitter, or Facebook. Which is interesting, because if we tend to disagree so much about the where, why do we not care enough to make radical changes? Perhaps this supports the notion that these things aren't so critical to the community after all, but only appear critical to the individual.

People in those communities also tend to disagree about things like the ways in which people should be allowed to participate and what name should envelop the community. This is what creates a sense of identity in a community. It's what is broadcast to the world and lets other people know "we are here!", so to speak.

However, there are certain things almost everyone in the community tends to agree on. These are what I call the low-variance parts of the stack. Very seldom would anyone willing to participate in an online technical community disagree with the shared interests or values of that community. In a technical community, specifically, these tend to center around a common problem that people want to solve and a certain way of solving it that mostly reflects their collective values. When someone does disagree with the shared interests or values, the community tends to band together to stop it from progressing. A change in the very thing that brought the community together is a tear in the community's true identity. Even if we don't agree on exactly where to gather or what to call ourselves, we can usually agree on what we stand for.

Participation is not Contribution

There's always the new guy or gal that walks into a community for the first time and asks a question or makes a statement that infuriates the lot. Sometimes we refer to this as trolling, but it's not always the case. Sometimes that person is just desperately seeking the approval of the community through participation. They believe that if they just participate in some fashion, whether that's asking a question or posting a comment to the mailing list, they will somehow have contributed a piece of themselves to the community and that should lead to their acceptance into the community.

The intention is good, but sometimes the tone is pretty bad. For example, someone on the PHP mailing list who complains that no one will implement their feature request, and insists that it is an awesome idea everyone should immediately get behind and support, drums up more conflict than support. What's worse is that the community will likely come together against this person as a result of their tone and not necessarily the merits of their idea. This is because what's happening here is simply participation in a community and not a contribution to the community.

The problem with walking into an online community and expecting to be waited on hand and foot is that you have yet to be recognized as a part of that community. The community views you as an outsider until you demonstrate that you have submitted to its culture, language, ideologies, and methodologies. When you come in arms swinging, you are going to automatically turn the entire community against you. You only give them a clearer indication that you are not a part of this community. You don't share the same values. You don't subscribe to the same vision. You aren't eager to join in and coalesce, but rather wish to disrupt the natural order of things. This is only good when what you're disrupting is objectively harmful for the members of that community. Not when what you are disrupting is the very thing that banded the community together in the first place.

Participation takes on many forms in a community, though. Much of the participation that happens in a community like StackOverflow, for example, actually has nothing to do with the few experts that are answering questions or the ones asking them. The bulk of the activity that occurs in the StackOverflow community actually comes from the passersby that are finding these questions and answers through Google. They are the ones seeking an answer to the same or a similar question they stumbled upon themselves, or trying to find a better answer than the one they already know. This lot then potentially joins the community and upvotes the answers and questions they found useful, or just takes the answer and runs. This is a form of participation that doesn't actually yield any valuable contributions back to the community. At least not immediately.

What I've found in most communities that grow large enough is that the majority of participation in the community amounts to a lot of noise and very little signal at first. Only once the community has gathered enough people that the shared interests and shared values are well aligned among a strong enough subset of its members does the actual signal begin to take hold.

It kind of works like this. At first you have a lot of people joining your community, and as such the level of participation goes up exponentially. Though the vast majority of that participation usually just amounts to noise with very little contribution of any real measurable value. When you answer a question on StackOverflow and 10 other people also provide an answer to that same question, this is a form of participation. It's not actually a contribution that has added value quite yet. Since the question seems to have so many valid answers, additional discussion will more than likely ensue, and some people will amend or update their answers while others might even delete their answers altogether in favor of another. In the meantime voting occurs, which is another form of participation, and it makes it clearer to outsiders where the community is identifying value. It may take just a few minutes for the participation of answering a question to occur, but it may take days, weeks, or even months for the resulting contributions of that participation to yield identifiable value.

The key difference between participation and contribution is that the former is macroscopic. It can be seen and identified almost immediately. It's also viewed as the most important facet to the individual members of the community. It's the thing that appeals to them the most. Whereas contributions tend to be a lot more microscopic. They don't present themselves until much later – after a lot of the noise has died down and the participation level has tapered off. For example, how many people you've reached on StackOverflow with your questions or answers, over the years. Or how much reputation you continue to accrue from that contribution over time. How many badges it earns you. Or how many questions you've edited for improvement, and so on… The point is the measure of value from a contribution is much harder to see with the naked eye.

Contribute in a Healthy Way

The trick to creating more valuable contributions in an online technical community is to identify the overlapping areas of shared interests and shared values the community holds so dearly in order to ascertain the social good that that community is creating.

A social good, or common good, is basically the good or service that benefits the largest number of people in the biggest possible way. For example, PHP's social good is providing the world with a way to easily build web sites for free. Anyone in the world can begin building a web site with PHP for free, on any platform, with access to immense support and free resources. To me that definitely counts as benefiting the largest number of people in the biggest possible way. Remember that Wikipedia, the world's largest encyclopedia, was built on PHP. The fact that you can take something that has such an immense positive impact on the world and use it to create more social goods is what makes a social good great!

But to find that tiny intersection where shared interests and shared values align so well that they leave enough room for social goods to grow, and flourish over time, is a rather challenging task.

Not all online communities produce a social good. In fact, what some of them produce is actually quite harmful. Take just about any online dating community that is built on the premise of casual intimate encounters, for example. Oh wait… we're talking about technical online communities! Take just about any online technical community that is built on the premise of casual technical encounters, for example (i.e. forums). It's evident that the design of most forum software today is quite horrible.

Everything about a forum speaks to the self-interest of the individual members of those communities that gather in a forum setting. The number of posts you make to a forum is some indication of the value you bring to that forum. As such many forum moderators get promoted based solely on this metric. The fact that you can hijack anyone's thread with the only recourse being the intervention of a moderator is yet another indication of individual members acting purely out of self-interest, and not the shared interests of the community.

When Jeff Atwood, co-founder of StackOverflow, talked about his latest project Discourse at the Stanford HCI Seminar a few years ago, I listened very carefully, because I agree that the state of online discussion forums today, around which many technical online communities have formed, is rather displeasing. When people are gathered solely around a shared interest, what you have is a crowd, not a community. Without a shared set of values to align these interests you cannot build communities that yield value-based contributions. Jeff talks quite a bit about the contrast of how StackOverflow was an engine for no, while Discourse was an engine for yes. But what's really interesting is that they both appeal to a set of common shared values that the members of those communities hold near and dear to their hearts. These are the things around which the community resoundingly gives the no or yes.

No, you cannot ask an opinion-based question on StackOverflow, because we value objective ways to determine the correctness of an answer.
No, it is not OK to ask a question if you don't do your research first, because we value giving up our time to help those that are struggling the most after having done their research.
Yes, we will give you an opportunity to correct your behavior if it is seen as misaligned with the interests of the community, because we value growth and self-healing.

These are the values of online communities aligning with the shared interests that the individual members of those communities have in common. These are the things that create cohesion in a community. They're what keeps a community alive long after its founding members have gone. If you want to know whether your community is creating contributions in healthy, meaningful ways, just ask yourself: if the key members of this community were to exit today, would we still have the same community tomorrow? If the answer is no, then you don't have a healthy community that is producing valuable contributions. What you have is a crowd gathered around a few popular individuals.

Communities vs Crowds

When you consider how an online tech community like PHP-FIG started, for example, you find a very interesting back story buried deep under piles of online interaction. It began as the PEAR group where certain members of the PHP community wanted to gather and form standards and build a repository of code for everyone to share. If you've never heard of PEAR, that's OK. It's a ghost town now. The problem there was that PEAR was never a community. It was a crowd that gathered under shared interest. The efforts to create a set of shared values that the crowd could align to were quite futile.

This is probably because the crowd was too eager to focus on the highest variance parts of the online community stack, like what to call themselves, where to gather, and how the various members of the group should participate. From this you get the bikeshedding effect, where the greatest amount of time is spent focusing on the least vital issues.

Later attempts at this finally began to reveal the fatal flaw in the crowd's thinking. Soon the PHP-FIG began producing something. A contribution of measurable value. It was called PSR-0. PSR-0 had a surprising effect. It shifted the focus from thinking about the high variance parts of the online community stack to thinking about the low variance parts of that stack. What the developers of PHP frameworks valued the most at that time was being able to share better code contributions. Code that was more useful and interoperable. Though these frameworks were open source projects where everyone had to contribute, not everyone was aligned on how contributions should be made or on the best way to determine which contribution is more valuable than another. PSR-0 made it possible for everyone to agree on which side of the road they should drive. Once this happened, people almost immediately began to recognize why values needed to be aligned well with interests, and it turned into an actual community.
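To make the "which side of the road" agreement a bit more concrete, here is a rough sketch of the kind of rule PSR-0 settled: how a fully qualified class name maps to a file path, so that one autoloader can load code from any framework. This is my own illustration in Python, not code from PHP-FIG. The function name `psr0_path` is hypothetical, and for simplicity it always uses `/` where PHP would use its platform directory separator.

```python
def psr0_path(class_name: str) -> str:
    """Map a fully qualified PHP class name to a PSR-0 style relative path.

    Illustrative only: `psr0_path` is a made-up helper, and "/" stands in
    for PHP's platform-specific directory separator.
    """
    # A leading namespace separator is allowed and carries no meaning.
    name = class_name.lstrip("\\")

    # Split the namespace portion from the final class name.
    namespace, _, short_name = name.rpartition("\\")

    path = ""
    if namespace:
        # Namespace separators become directory separators; underscores
        # inside the namespace portion are left untouched.
        path = namespace.replace("\\", "/") + "/"

    # Underscores in the class name itself become directory separators.
    return path + short_name.replace("_", "/") + ".php"


if __name__ == "__main__":
    for cls in (r"\Doctrine\Common\IsolatedClassLoader",
                r"\namespace\package_name\Class_Name",
                "Zend_Acl"):
        print(cls, "->", psr0_path(cls))
```

The rule itself is almost trivially small, which is sort of the point. The value came from everyone agreeing to follow the same one.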

It's only at that tiny triple intersection of participation, shared interest, and shared value that we begin to identify the value of a contribution in a community. The really difficult part isn't finding people that share your interests. It isn't even finding people that share your values. It's finding people that are willing to participate in ways that make it easy for all of these things to align together very closely such that participation ultimately keeps more valuable contributions around than it discards.

The Store Front Dilemma

When I tried observing how new tech communities are forming online through the web and on IRC, I found that the majority of these same problems that happened with PHP-FIG can be observed in newly forming tech communities online every day. I call this the store front dilemma.

Most online communities are started by just one or two individuals. The problem is you can't have a community with just one or two people. You need to accumulate a critical mass. You need a crowd! And how do we gather crowds? By creating a confined store (place to gather) and putting up a big shiny sign that says who we are and what we're about (name). Then we open our doors and fill the store with anyone willing to buy the bullshit we're prepared to sell (shared interest). The problem is keeping them around. To do that you need to yield to what they find most appealing to them (ways to participate) as individuals, or otherwise they'll just leave.

Now imagine, if you will, this fantasy store with a glowing sign, packed with people who are all interested in the same thing, all interacting within this confined space. Suddenly, your dilemma is: how do we all create some value in here? Assuming you've managed to get past all the mediocrity of naming, creating interesting ways for members to participate, and gathering everyone under one roof, and still survived, how do you actually produce something of value?

The reality is that what little value you can produce in that situation is typically only seen when gathered under that one roof. This is not how communities typically work, though. No community lives under one roof. We usually spread out as we grow. So once you begin breaking down the borders of your imaginary store, and you no longer have a glowing sign with a big red arrow that says "we are here", will your community continue to survive? Will it still produce valuable contributions? Will it remember its shared values and interests well enough not to deviate? Will every member of your community act as your beckoning store front signage calling people to join?

Most of them do not. Either because they were never aligned on shared values in the first place or because when they spread out they drew further away from their values and lost their most valuable contributions. When no one else can see your community's impact or find its center, the members of that community themselves will usually grow bored or frustrated and leave. What's sad is that it's usually the most value-contributing members that tend to go first.

Another example of this can be found on Reddit, a popular online message board that surfaces content its subreddit communities deem interesting enough to share and spread.

One of the biggest problems in the reddit community today is that its maintainers have lost all touch with their community. What you see happening on reddit today isn't a community. It's a powder keg of crowds gathered around immensely growing high variance parts of the online community stack. The tumultuous crowds seem to have confused the administrators of reddit into believing the loudest spoken interests are the guiding values of the community. This is fundamentally flawed thinking in building online communities. Values aren't things we take consensus on. They should be binary. We either believe in them or we don't. If we don't then we don't belong in that community.

The result is that you see multiple stories of people leaving reddit because of these fundamentally flawed decisions that do not result in healthy vibrant communities with which people can clearly identify. There was even a Hacker News thread about this where you can see Jeff Atwood commenting on the unscrupulous behavior of the reddit admins in this matter.

"Why didn't they just contact you first and ask you about the behavior, expressing their concern? Did they contact you at all? How did you respond? That's not covered in this article. And it seems crazy to me that they would vote to shadowban you without attempting to talk to you at least a little. Is that what happened? If so, that's nuts."

The truth is reddit was a flawed way of building an online community to begin with. It was the very essence of the store front dilemma getting out of hand. The founders tried to gather crowds around reddit from day one by posting dubious content under fake accounts just to attract more attention to the site, stirring up emotions and trying to spark conversation of any kind in the interest of growth. This is never how you build a healthy online community. It's no wonder that some reddit moderators also find it trivial to conduct themselves unscrupulously on reddit boards, since reddit itself doesn't really make the shared values clear. It seems to me that whatever the consensus of the day is on reddit is what is considered indicative of the community's shared values. That's completely bizarre, since agreeing on things doesn't mean we've arrived at the best answer or that it's even a good answer. Just that we happen to have enough heads nodding without dissent. That's a sign of a community whose stack is upside down.

Shaping Behavior

Jure Leskovec, Associate Professor of Computer Science at Stanford University, gave a really interesting talk at EngX a few years ago where he used data from StackOverflow users to demonstrate how we can modify behavior within online communities through systems of reward.

He points out that on StackOverflow there is an Electorate gold badge, which is awarded to users who have voted on 600 questions or more with at least 25% of their votes being on questions. If you plot a user's typical behavior on the Y axis of a two-dimensional graph, plot the desired behavior you'd like to move the user towards on the X axis, and then map that behavior over time, what you get is something that looks a little like this.

The user of that community seems to only be interested in one behavior, or in participating in one particular way that they enjoy, like answering questions. So you want to modify that behavior slightly to benefit the community in a different way. You offer them an opportunity to get a gold badge by voting on 600 questions. Now the user actually has an incentive to participate in a different way. So what ends up happening is that you start to move the user along a vector whereby they're still participating in the community in the way that is important to them, but they're slowly starting to move closer to the other edge of the graph in order to get their gold badge. Over time they start increasing their participation level in the other behavior as they get closer to their goal. Then finally, once they get their badge, they usually just go back to participating in the way that suits them best.
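For the geeks in the room, the badge rule described above is simple enough to sketch in a few lines of Python. This is only my own illustration of the two thresholds mentioned (600 question votes, at least 25% of all votes on questions), not StackOverflow's actual implementation. The names `has_electorate_badge` and `electorate_progress` are made up, and the "progress" measure is just one assumed way to express how close a user is to the goal.

```python
# Thresholds as described in the text above.
MIN_QUESTION_VOTES = 600
MIN_QUESTION_SHARE_PERCENT = 25


def has_electorate_badge(question_votes: int, total_votes: int) -> bool:
    """Return True when both badge conditions are met.

    Uses integer arithmetic so the 25% boundary is checked exactly.
    """
    enough_votes = question_votes >= MIN_QUESTION_VOTES
    enough_share = (question_votes * 100
                    >= MIN_QUESTION_SHARE_PERCENT * total_votes)
    return enough_votes and enough_share


def electorate_progress(question_votes: int, total_votes: int) -> float:
    """Hypothetical progress toward the badge, as a fraction from 0.0 to 1.0.

    Progress is limited by whichever of the two conditions is further
    from being satisfied.
    """
    if total_votes <= 0:
        return 0.0
    count_progress = min(question_votes / MIN_QUESTION_VOTES, 1.0)
    share = question_votes / total_votes
    share_progress = min(share / (MIN_QUESTION_SHARE_PERCENT / 100), 1.0)
    return min(count_progress, share_progress)


if __name__ == "__main__":
    # A user who mostly votes on answers and is only halfway on question votes.
    print(has_electorate_badge(300, 2000), electorate_progress(300, 2000))
```

A user who only ever votes on answers sits at zero on both conditions, so the only way to move toward the badge is to start doing the other behavior, which is exactly the nudge being described.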

What's interesting here is that it's been shown to work quite well without any real evil. It promotes valuable contributions in the community in a healthy way that isn't driven entirely by an individual's self-interest, but without forcing the individual members of the community to change the high variance choices of their mode of participation.

How to Build Better Communities

So the question remains: how do we build better communities online knowing all this? Well, I've found that the best functioning technical communities online today are the ones that really set out to solve a problem that people actually see as a problem. They didn't try to invent a problem so that they could be heralded for solving one. The best communities are also able to identify with their most active members. They know what healthy participation should look like and what unhealthy participation should look like. They can rely on their most active members to be their flashy signs that draw new members in, and depend on their values to keep the entire community together.

Here are a few of my suggestions on how to build better online tech communities.

Have a Code of Conduct

Sometimes just having a CoC lets people know upfront what you stand for and why you're here. Without this it makes it harder for outsiders to identify with your values and know if they belong here or not. A CoC should be used to keep the peace, or resolve conflict in the higher parts of the online community stack (i.e. high variance issues).

Talk to your community regularly

If you're not talking to your community on a regular basis and hearing what they have to say or what they feel is worth discussing, then you probably shouldn't be trying to build a community in the first place. You really need to listen very carefully to what a community is saying. My philosophy is generally that I am open to everything but committed to nothing until I fully understand the choices available to me. I believe that good community leaders should respond to their communities in this same way, because committing early to a decision that will affect the long-term direction of your community could be the very thing that derails your entire community from its true values.

Propagate identity persistence

Once you grow out of your niche and the community expands beyond the artificial borders it had when it was just an idea in the back of your mind, you will no longer have complete control over the domain of your community. Your community, if successful, will spread like wildfire to every corner of the Internet, without your explicit consent.

So you must foster the identity of your community, and the values you know will keep it all together, in every member, such that they will be your biggest signage. They will be the watchers from beyond the wall. They can speak up about the virtues of your community within other communities, and either prevent those who wish to impinge on your core values from infesting the community or harness their knowledge of your community to gather more members that share the same core values.

Heal from within

The biggest woe of a growing community is that someone is bound to make a mistake. At some point someone is bound to breach your Code of Conduct or wander outside the boundaries of your community's shared values and interests. When this happens, and not if, you should find ways to allow these people to correct their behavior before you resort to outright banning them from the community. This is very important, because sometimes banning someone incites more problems than it solves. I mean, just look at reddit as the shining example of causing more conflict than resolution through bans. Take measures that don't leave all the power in the hands of just a few key moderators. Instead, try building tools that can detect bad or frowned-upon behaviors in your community, and autonomously offer up incentives or encouragement for those members to change or modify their behavior. I mean, after all, we are techies with big brains, aren't we?
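To give a feel for what such a tool might look like, here is a very rough Python sketch of a "nudge before ban" policy. Everything in it is an assumption made for illustration: the `MemberRecord` type, the idea of counting community flags, the thresholds, and the step names are all hypothetical and not taken from any real forum software.

```python
from dataclasses import dataclass


@dataclass
class MemberRecord:
    """Hypothetical per-member state tracked by the community tooling."""
    name: str
    flags: int = 0        # flags other members raised on recent posts (assumed signal)
    nudges_sent: int = 0  # how many automated nudges this member has received


def next_step(member: MemberRecord,
              flag_threshold: int = 3,
              max_nudges: int = 2) -> str:
    """Decide what the tooling should do next for a member.

    Returns "none", "nudge", or "escalate_to_review". Escalation hands the
    case to a wider review group instead of acting on its own, so a ban is
    never the first or an automatic response.
    """
    if member.flags < flag_threshold:
        return "none"

    if member.nudges_sent < max_nudges:
        # Offer encouragement to change course and give a clean slate.
        member.nudges_sent += 1
        member.flags = 0
        return "nudge"

    # Nudges have not helped; people need to look at this together.
    return "escalate_to_review"


if __name__ == "__main__":
    member = MemberRecord("newcomer", flags=3)
    print(next_step(member))  # first response is an automated nudge
```

The design choice worth noticing is that the member gets a chance to correct course more than once, and that the final step is a review rather than a ban handed down by one moderator.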