Fandom stats and surveys and so on

Sherlockian fandom stats & data chat!
You're all invited to join an online chat about fandom stats & data this weekend -- and/or to suggest topics or questions!  (We don't have any presentation planned, just a fun freeform chat about collecting and sharing data about the fandom, about interesting stats, and about whatever else anyone feels like talking about on this general topic.)

Come join if you're interested in Sherlock Holmes and/or fandom data; the more the merrier!  Sharing fffinnagain's lovely invite below (Tumblr version):

Data, Data, Data (About Fandom)!

There are various Sherlockians (and other fans) doing cool fandom statistics and fan surveys to learn more about Sherlockian fans and fanworks. Come join destinationtoast and fffinnagain at unlockedcon for a discussion about fandom stats, surveys, and data of all sorts. Fandom data geeks are welcome, but so are all other fans; no stats background is required, just curiosity!

The empirical fun starts at 7:00 pm (UTC) on Saturday, April 25th (convert to your local time). It’s just a week away, but RSVPs and suggested topics are totally welcome!

AO3 vs FFNet, or how our fanworks accumulate

How many works are posted per day to our favourite* archives?
As far as I know, neither AO3 nor FFNet actually tracks these numbers, or at the very least, they don’t publish them. But being the curious cat that I am, I couldn’t help exploiting a detail of their respective structures to figure it out.

Caveats: These plots report the number of works started, whether or not they are still in draft form when the clock strikes midnight in Greenwich. The squiggly lines above report, for each fortnight, the median of the daily count of work numbers assigned, which helps smooth away some of the mess. A lot more data massaging went into getting these time series, and if you are curious about the nitty-gritty, click through the read more.
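For anyone who wants to try this kind of smoothing themselves, here is a minimal sketch of the fortnightly median described above. It is not the code behind the plots. It assumes you have already recorded, once per day, the highest work number an archive has handed out; the function names and the shape of the input dictionary are made up for illustration.

```python
from datetime import date, timedelta
from statistics import median


def daily_counts(last_id_by_day):
    """Turn {day: highest work number seen by the end of that day}
    into {day: how many work numbers were assigned that day}.

    Assumes one observation per consecutive day (hypothetical input format).
    """
    days = sorted(last_id_by_day)
    return {
        today: last_id_by_day[today] - last_id_by_day[yesterday]
        for yesterday, today in zip(days, days[1:])
    }


def fortnightly_median(counts, start):
    """Group daily counts into 14-day blocks beginning at `start`
    and return the median daily count of each block."""
    blocks = {}
    for day, n in counts.items():
        block_start = start + timedelta(days=14 * ((day - start).days // 14))
        blocks.setdefault(block_start, []).append(n)
    return {block: median(values) for block, values in sorted(blocks.items())}
```

The median is used rather than the mean so that a single unusual day (a mass import, a burst of deleted drafts) barely moves the fortnight's value.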

But anyway, what can this graph of posting rates for the last four-ish years tell us?

1. Archive of Our Own has been growing like crazy. The daily posting rate these days is more than 10 times that of mid-2010, having risen from 150 to nearly 2000 by the end of 2014. So congrats to the OTW volunteers for keeping up!
2. FanFiction.Net is the bigger archive and continues to accumulate works faster than AO3, but it has been slowing down in the last couple of years. If these trends continue, AO3 will overtake FFNet in the next year.
3. Both archives show big spikes in activity around the New Year, but their activity through the rest of the year shows some differences.

So here is a plot of daily posting rates relative to the annual average for each archive.
1. Both archives are most active from the very end of December to the beginning of February, and then AO3 drops in March.
2. Both archives rise again through April, but FanFiction.Net has a sustained period of relatively more posting from the beginning of June to the end of August. This is probably from all the (North American? European? Northern Hemisphere?) high school students out of school for the summer.
3. AO3 doesn’t show the same increase in activity in that period, from which I’d argue that a smaller portion of AO3 users are of high school age. Still, AO3 also loses some momentum through September until the big surge in late December.
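"Relative to the annual average" can be read a few ways. One plausible reading, sketched here under the assumption that each day's count is divided by the mean daily count of its own calendar year, looks like this. The function name and the input format are hypothetical, and the original plot may have been normalised differently.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def relative_to_annual_average(counts):
    """Divide each day's count by the mean daily count of its calendar year.

    `counts` maps datetime.date -> number of works posted that day
    (hypothetical input format). A value of 1.0 means "an average day
    for that year"; 1.5 means 50% busier than average.
    """
    by_year = defaultdict(list)
    for day, n in counts.items():
        by_year[day.year].append(n)
    annual_mean = {year: mean(values) for year, values in by_year.items()}
    return {day: n / annual_mean[day.year] for day, n in counts.items()}
```

Normalising per year is what lets a fast-growing archive and a slowing one be compared on the same axis: the overall growth drops out and only the seasonal shape remains.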

Any other suggestions on what annual events might explain these bumps are welcome!

I smoothed out all the weekly fluctuations in the above time series, but there are patterns at this time scale as well. The bar graph below shows the relative rate at which works are posted per day of the week (or rather, from noonish of one day to noonish of the next). On AO3, Tuesday is our slowest (night) and Sunday our most active. On FFNet, Thursday is the slowest and Saturday to Sunday the most active. That Friday picks up on FFNet but is relatively low on AO3 again suggests an older crowd using the latter, who may well be going out instead of writing fanfiction all night.
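A rough sketch of the day-of-week comparison, assuming the same kind of daily counts as input. It groups by calendar day and does not reproduce the noon-to-noon windows mentioned above; the function name and input format are illustrative only.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def weekday_relative_rates(counts):
    """Return {weekday name: mean count on that weekday / overall mean count}.

    `counts` maps datetime.date -> works posted that day (hypothetical format).
    Values above 1.0 are busier-than-average weekdays.
    """
    by_weekday = defaultdict(list)
    for day, n in counts.items():
        by_weekday[day.weekday()].append(n)  # Monday is 0, Sunday is 6
    overall = mean(list(counts.values()))
    return {WEEKDAYS[i]: mean(by_weekday[i]) / overall for i in sorted(by_weekday)}
```

On a growing archive it would be safer to run this on counts that have already been divided by a local average, so that later (busier) weeks do not dominate.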

A last note: I really want to do the same kind of estimation for Wattpad, but their numbering system makes no sense to me (yet). One day I might crack it, but it would take a lot more work to get comparable data.

For the details on where these numbers come from, click through to the full post.

Tools tools tools!
Hi all!

I've been fantasizing about digging into some fanstats (haha), but it's been hard getting started and figuring out which tools are already built and which aren't. Could we do a quick tools share, broken down by category (e.g. what API you use to interface with AO3, what analysis framework you found useful), with appropriate GitHub repo links?

Maybe at the end we can make a summary list and have it around as a resource for everybody who wants to get started but can't find a way in!

Discussion: Fandom stats media coverage
Last week, io9 ran an article about my apocalyptic fiction stats.  I had slightly mixed feelings about the experience -- on the one hand, they very kindly reached out to me ahead of time to ask my preferences about how I wanted to be cited, and if I had any other comments (which, e.g., The Daily Dot has not done when they've covered my work).  OTOH, their article felt like it was just my post paraphrased.  Which brought me a lot of new followers, but I wasn't the one getting paid to write that article, even though I created the content.

Also in the past week, Vulture ran a big guide to fanfiction which had a bunch of stats about fandom -- but almost exclusively about Wattpad.  They did cite one AO3 stat that I think they must have gotten from me, but it was from 2013 and on a topic I've explored in far more detail since then.

I'm not actually grumpy about any of these events, but it got me thinking.  What should be best media practices when talking about fandom, especially when mentioning anything quantitative?  And should they talk to the people who did the stats, or just make use of whatever data is publicly available?  Are there things we should do differently when we publish analyses, in case journalists want to use them?  (It does make me feel like I should be even more careful in the future about explaining my methods and analyses and providing caveats about any conclusions.)

FYI -- new stats: languages on AO3 and FFN
I'm trying to get better about posting multiple shorter analyses rather than just enormously long posts -- I get the sense people are less overwhelmed by 1-3 graphs than by pages of analysis.  So even though I'm in the midst of a lot of different comparisons of AO3 and FFN, I just posted one small piece.  Thoughts/feedback welcome if you have them, as always.

Interesting course in research methods and statistics
So, I don't know if anyone else around here is enough of a novice at all this to find it attractive, but I happened across this course, which is offered online by the University of Amsterdam and covers the following topics:

How to interpret and evaluate quantitative and qualitative research
How to interpret and use descriptive and inferential statistics
How to separate solid science from sloppy science, based on methodological and statistical grounds
How to design, conduct, analyze and report your own research in a capstone project

The next series of courses (there are 4 in the specialisation) starts in August, so there's plenty of time to decide if anyone's into it. It says it's about 4-8 hours work per week.

It seems like the kind of thing that would be awesome to do with other people who were interested in similar research, as it culminates in a research project conducted with a team.

Anyway -- I thought I'd throw it out there, in case anyone's into it. I mean, I have a degree in literature, so... I am a total layman here. If anyone else is interested, let me know!

Discussion: Apocalypse, media stats, and Wikipedia as data source
Just threw together some quick apocalypse stats based on Wikipedia (not directly about fandom, but about media):

I kept it short and quick, but feedback/thoughts about the graphs still welcome.

I'm also interested in discussing the pros and cons of studying non-fannish data sources... Wikipedia is mostly edited by a non-demographically diverse set of white dudes, IIRC, but it does have some useful lists like this one.  I've also thought about trying to maybe use data from sources like IMDB, GoodReads, etc. to try to get a sense of how the media landscape has changed over time -- mostly with an eye toward trying to spot how that correlates with changes in fandom, and which kinds of source media tend to make for lively fandoms.  But I'm a bit daunted at the scope of such a project.

Other thoughts, ideas, suggestions of data sources?  Or suggestions for things besides the apocalypse that would make for fun one-off stats topics?  

Fandom stats roundtable discussion on Three Patch Podcast
The most recent episode of the Sherlock fandom podcast, the Three Patch Podcast, includes a roundtable discussion of fandom stats and surveys by a number of names likely to be familiar to members of this community (including me).  It's really awesome and interesting, and mostly not specific to Sherlock fandom.

I finally got a chance to hear it -- fffinnagain (who moderated the conversation and edited the segment), you did an utterly fantastic job!!  Thanks so much for running this!

Listen (relevant segment starts at 1:11:00) -- show notes are also at the link

There will eventually be a transcript, as well -- I'll try to post it.  Feel free to discuss the episode in the comments.

What affects individual fanwork popularity over time? -- help with/brainstorming for analysis?
Thanks so much to everyone who answered my last post -- I haven't had time to respond to the feedback yet, but it was very helpful.  So I thought it might be good to get feedback early on about another project I've been working on -- looking at what affects individual fanwork popularity over time.

I'm looking for help with data-gathering methodology and/or visualizations, and there are specific questions at the end of the post.  First, though, behind the cut, I'll share the graphs I have so far.


So, my questions are:

  • Is this (potentially) interesting? What would make it more interesting, especially to people who are not me?

  • How can I generalize this beyond my own fanworks? I only have my own kudos emails.  It has occurred to me that public bookmarks do have dates attached, so we could potentially scrape the dates of bookmarks for a large number of works and get aggregate stats.  The good thing about my own personal stats, though, is that (a) there are a lot more kudos than bookmarks, so bookmark data would be sparse by comparison, and (b) I have a sense of why a bunch of the peaks happened for my own fics, but not usually for other people's. Edit: and (c) as soon as I start playing with other people's data I have to worry about privacy, anonymization, etc.

  • How should I improve my visualizations? Can I make the above graphs clearer/more interesting (aside from the event captions I said I want to add)?  Are there other visualizations of the above data I should do, or related data that would be useful to visualize instead/as well?

Thanks for any feedback!  Like I said, it was super helpful with the last post I did -- and I actually think this data set is potentially more interesting to writers than the last set, but I'd love your thoughts.

Help with a fandom stats analysis about het vs. slash vs. other?
I have a statistical analysis that I tried to do that just doesn't really have any big conclusions and seems a bit confusing/meandering.  I'd love advice if anyone has any idea how to improve this (could involve doing more/different analyses, or just explaining things differently).

