melannen wrote 2014-05-27 06:16 pm

How to graph the spread of a Tumblr post

A while back, I posted a graph on Tumblr illustrating the spread of my most popular post ever via reblogs (it's currently at almost 4500 notes, which is just weird. why, tumblr).

Several people have asked for instructions on how to do the reblog graph, so here they are. I realize this is a post that's all about tumblr but I'm posting on DW. That's because while Tumblr is *great* for virality, I still refuse to use it for long or complicated text posts, at which it is terrible.

This isn't necessarily the best way, but it's the way I can do it at work, with no expertise, without downloading any software. (If you want to play with really powerful network graphs, download the open source package Gephi - as a bonus its sample data set is a network chart of Les Miserables characters!) But Gephi is hard. Here's how to do it with just Notepad, Paint, and Google.

As my example, I decided to use a post from someone who is authentically popular on the internet, unlike me: friend and artist Kevin Bolk of Interrobang Studios. Specifically, his Sexy Astrophysicists Art Card Set (as of mid-April 2014, when I started writing this post). I've made the Google Fusion link public if you want to play without starting from scratch: kbonetwork.

Here's how to start from scratch:

Step I: Acquire and clean the data


1. Go to the tumblr post's page. Click "Show More Notes" until you have loaded them ALL. (If you are running tumblr extensions that add extra info here, either turn them off or use a browser without the extension installed.)
2. Copy & paste the list of notes into Notepad or other basic text editor. (Make sure it doesn't paste the icons.)
3. Use find/replace to run the following, keeping the leading & trailing spaces exactly as I have them (but not including the quotation marks):
a) Replace " reblogged this from " with "&"
b) Replace " likes this" with "& likes"
4. Use find to search on the following and remove extra info:
a) Find "added", and remove extra information after the reblogee's name (including extra lines.)
b) Find "said", and remove the lines that are just comments.
c) Remove the last line, which says "[Person] posted this"
5. Save the text file.
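
If you'd rather not do the find/replace by hand, the same cleanup can be sketched in a few lines of Python. This is only a rough sketch, not a tested tool: the function name clean_notes is made up for illustration, and it assumes each note sits on its own line in the form "X reblogged this from Y" or "X likes this", with added commentary starting at " and added". Check it against what your own paste actually looks like.

```python
import re

def clean_notes(raw_text):
    """Turn pasted Tumblr notes into 'name&target' lines, as in Step I.

    Assumes one note per line (hypothetical layout; check your own paste).
    """
    cleaned = []
    for line in raw_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if " reblogged this from " in line:
            reblogger, source = line.split(" reblogged this from ", 1)
            # Step 4a: drop anything after the source name, e.g. " and added:"
            source = re.split(r"\s+and added\b", source, maxsplit=1)[0]
            cleaned.append(f"{reblogger}&{source.strip()}")
        elif line.endswith(" likes this"):
            # Step 3b: "bob likes this" becomes "bob& likes"
            cleaned.append(line[: -len(" likes this")] + "& likes")
        # Steps 4b and 4c: any other line (comments, added text,
        # "[Person] posted this") is simply dropped.
    return cleaned
```

To use it, paste your notes into a text file, read it in, and write "\n".join(clean_notes(text)) back out as the file you upload in Step II.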

Step II: Make a spreadsheet.


These instructions are for Google spreadsheets, since we're already using Google for the graph. Both LibreOffice and Excel have simpler ways to accomplish more with this data, if you're comfortable with them.
1. Sign in to a Google account. Create a new spreadsheet in Google Drive and then select File -> Import -> Upload and upload the text file you just saved.
2. When the dialog box pops up, select "create new spreadsheet" and Separator Character: Custom: &
3. Press Import. A line will appear at the top that says "File imported successfully. Open now »". Click on Open Now.
4. You should see a spreadsheet with two columns, one of the person who left a note, one with either " likes" or the name they reblogged from. Insert a row at the top and label your columns in that row.
5. Click the box in the upper left corner to select all. Then go to Data -> Sort Range.
6. Click "Data Has Header Row" and sort by your second column, A-Z. This should sort all your likes to the top.
7. Delete all the likes. Make sure the spreadsheet is saved.
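
For anyone who'd rather skip the sorting and deleting, here is a minimal Python sketch of the same step: it takes the "name&target" lines from Step I, throws out the likes, and produces a two-column CSV with a header row. The function names and the column labels "reblogger" and "reblogged_from" are my own placeholders, not anything Google requires.

```python
import csv
import io

def edges_from_cleaned(lines):
    """Keep only reblog rows from 'name&target' lines; likes are dropped."""
    edges = []
    for line in lines:
        name, _, target = line.partition("&")
        target = target.strip()
        # Rows whose second column is "likes" (or empty) are not reblogs.
        # Caveat: a blog literally named "likes" would be dropped too.
        if not target or target == "likes":
            continue
        edges.append((name.strip(), target))
    return edges

def to_csv(edges):
    """Return CSV text with a header row, one reblog per line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["reblogger", "reblogged_from"])  # placeholder labels
    writer.writerows(edges)
    return buf.getvalue()
```

Saving the output of to_csv as a .csv file should give you something you can import as a spreadsheet in place of steps 4 through 7.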

Step III: Load the Data into Fusion Tables


1. Sign in to the same google account.
2. Go to the Fusion Tables app ( http://tables.googlelabs.com ) . Click "Create a fusion table". (Note: Fusion Tables is experimental and may change unpredictably.)
3. On the Import screen, select "google spreadsheets", and pick the spreadsheet you just made.
4. When it has loaded, click "Next". Add any metadata you would like. Click "Finish".

Step IV: Make the graph


1. Click the red + at the end of the tabs row. Select "Add Chart".
2. Make sure the last item in the left sidebar, the network graph, is selected. It should load a graph.
3. Above the graph, it should say [X] of [Y] nodes. Make sure X and Y are equal so that all your nodes are showing.
4. If you'd like to play with the other options, do so (they're currently pretty limited.)
5. When you're done playing, press "Done".
6. Double-check that [X] of [Y] nodes is still showing them all - it may keep trying to reset itself.
7. You can drag-and-drop individual nodes in the graph to make them bounce around! It's not well-behaved but you can play with the shape a little.
8. Zoom with the + and - buttons until you like what you see.
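
If you want a sanity check on the [X] of [Y] nodes count, or a quick list of who the biggest hubs are, something like the following Python sketch works on the same list of reblog pairs. It assumes every distinct name counts as one node, which is my guess at how Fusion Tables counts them, and graph_summary is just an illustrative name.

```python
from collections import Counter

def graph_summary(edges, top=3):
    """Count distinct names and find the most-reblogged-from blogs.

    edges is a list of (reblogger, reblogged_from) pairs.
    """
    nodes = set()
    reblogged_from = Counter()
    for reblogger, source in edges:
        nodes.update((reblogger, source))
        reblogged_from[source] += 1
    return len(nodes), reblogged_from.most_common(top)
```

The first value is the number to compare against [Y]; the second is a list of (name, count) pairs for the blogs that were reblogged from the most, which should line up with the biggest clusters in the graph.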

Step V: Share


1. AFAIK, there is no built-in way to make a fusion table into a graphic, so you have to do it the old-fashioned way by taking a screenshot. Get the table looking the way you want it to look, then (on Windows and most Unix) press the Print Screen key. (On other OSes, look up how to take a screenshot.)
2. Go into a graphics editing program (the built-in Paint-type program will do) and "paste". Your screenshot should appear.
3. Crop it down so it only shows the graph. (If you don't know how to crop, check your graphics program's help files.)
4. Save it as a .png or .gif file. Upload to Tumblr. Profit.
5. If you'd like people to also be able to play with the version on Google, go back to that file (it should show in your Google Drive account), and select "File" then "Share" in the menu. Change "Who has access" to either Public or Anyone With A Link. You can use the URL on that screen to share the link.

Result:
[Image: network graph of sexy astrophysicists]

If you try this and have issues, drop a comment to this post, and I'll do my best to answer.

[personal profile] zana16 2014-05-28 02:04 am (UTC)(link)
Very cool. Thanks for the tutorial!

Wow!

[personal profile] ysabetwordsmith 2014-05-28 09:10 am (UTC)(link)
The supersharers really pop right out, don't they?

[personal profile] whatistigerbalm 2014-05-28 11:06 am (UTC)(link)
I was going to do this with my most reblogged thing until I got to your step one; with some 50,000 notes it would take longer than I can be bothered, I think. (And, if this helps your collection of data on tumblr, it was easy to tell what the hubs of activity were: reblogs by frogman, bunnyfood, and getting highlighted by Tumblr staff; those spiked thousands of further notes on their own.)

[personal profile] whatistigerbalm 2014-05-29 12:38 pm (UTC)(link)
I too found that the 100-500 range is the most useful note count; I used to have a fandom blog on Tumblr that, in its rather small and disjointed fandom, was something of a central hub. The most interesting information I got from it was not *where* the reblogs spiked - it was me - but *when*; I learned quite a lot about the best posting/reblogging times for maximum spread, usually aligned (I guess, but I'd bet on it) with young Americans' online times.

[personal profile] whatistigerbalm 2014-05-29 07:56 pm (UTC)(link)
'Fraid not, sadly; I rely on xkit too. It was more a case of comparing the times I posted or reblogged things to the amount of response, and drawing out patterns. Since I was the "supersharer" of the lot it was reasonable enough (and corroborated by the notes) to assume any bursts of intense reblogging were due to being seen by the most people at their time of browsing, and not being reblogged by something with more followers.

[personal profile] isis 2014-05-28 05:31 pm (UTC)(link)
This is very cool! Except for the part where it is about Tumblr :-)

The comment above about supersharers brought to mind the chapter of, um, one of Malcolm Gladwell's books (I think it was Outliers? Might have been Tipping Point, though), where he's talking about mavens, and how they are super-connected and important in the spread of trends and ideas.

[personal profile] siegeofangels 2014-08-06 11:14 pm (UTC)(link)
I tried this and did NOT have issues--your instructions are very clear and easy to follow. Thank you for posting this!

[personal profile] ceruleancat 2017-11-17 01:36 pm (UTC)(link)
I know this is an old post, but still very interesting.
And the graph is beautiful.
Do you know if in the meantime there's a more automated way to do such network graphs?