I’m in a class where we are learning R. Here’s my second week’s assignment.
- The number of users with more than 10000 followers. What percentage of the total number of users is this?
tonsoffollowers <- users[users$followers_count > 10000,]
nrow(tonsoffollowers)
>>>
2322
1.3%
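Both numbers can also come straight from a logical vector, since sum() counts the TRUE values and mean() gives their share. A minimal sketch, assuming a tiny made-up data frame (toy_users is hypothetical and is not the class data):

```r
# Hypothetical stand-in for the class data set; only followers_count is needed.
toy_users <- data.frame(followers_count = c(12, 25000, 300, 10000, 98000))

# TRUE for every user with strictly more than 10000 followers
big <- toy_users$followers_count > 10000

n_big <- sum(big)           # number of TRUE values
pct_big <- mean(big) * 100  # share of TRUE values, as a percentage
```

With the real data, users$followers_count would take the place of the toy column.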
- The user’s name and screen name with the most followers
max(tonsoffollowers$followers_count)
tonsoffollowers[tonsoffollowers$followers_count==2562105,]
>>>
Rev Run
RevRunWisdom
2562105 followers
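Looking up the row with which.max() avoids copying the maximum into the second command by hand. This is only a sketch on made-up rows (toy_users and its values are hypothetical); the column names follow the ones used in this post.

```r
# Hypothetical rows; the real data has many more columns.
toy_users <- data.frame(
  name = c("Ann Example", "Bob Example", "Cy Example"),
  screen_name = c("ann_ex", "bob_ex", "cy_ex"),
  followers_count = c(120, 98000, 4500),
  stringsAsFactors = FALSE
)

# which.max() returns the index of the (first) largest value
top <- toy_users[which.max(toy_users$followers_count),
                 c("name", "screen_name", "followers_count")]
```

Note that which.max() returns only the first match if several users tie for the maximum.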
- The number of users who are following 0 people (the friends_count variable).
nrow(users[users$friends_count==0,])
>>>
7110
- The percentage of accounts that are protected.
nrow(users[users$protected=="true",])/nrow(users)
>>>
1.6% (28,176)
- The number of people who have more followers than friends.
nrow(users[users$followers_count>users$friends_count,])
>>>
75258
- The number of people who have exactly the same number of followers as friends.
6367
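No command is shown for this one. One possible way to get such a count is to compare the two columns element by element and sum the result. This is a sketch on made-up numbers (toy_users is hypothetical), not necessarily how the figure above was produced.

```r
# Hypothetical counts for four users
toy_users <- data.frame(
  followers_count = c(10, 5, 0, 42),
  friends_count   = c(10, 7, 0, 41)
)

# Element-wise comparison gives a logical vector; sum() counts the TRUEs
n_same <- sum(toy_users$followers_count == toy_users$friends_count)
```

If either column contained missing values, sum(..., na.rm = TRUE) would be needed to get a number back instead of NA.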
- The user’s name and screen name who is listed the most times.
max(users$listed)
>>>
33636
users[which(users$listed==33636), c("name","screen_name","listed")]
>>>
Miley Ray Cyrus
MileyCyrus
listed 33636
2. Because I can’t get away from those baby names, create a plot of the counts of the first letters of Twitter user names. Let’s make a crazy assumption that the average Twitter user’s age is 22, so compare this plot with the first letters of baby names from 2011 – 22 = 1989. Write a couple of sentences about the differences / similarities you see. HINT: You should use tolower() at some point, and you’ll want to strip out the Unicode characters that don’t correspond to a-z. R has a handy built-in vector called “letters” that you can use to easily strip out the first letters you have that are not in a-z.
gsub("[^a-zA-Z]", "", x)
first.letter <- substr(gsub("[^a-zA-Z]", "",tolower(users$name)),1,1)
(I realize I didn't answer this here... lost my history and need to redo)
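For the Twitter half of the question, a rough sketch of the counting and plotting step might look like the following. It follows the hint about the built-in letters vector rather than the gsub() line above, so names that do not start with a-z are dropped instead of being reduced to their first letter. The names in toy_names are made up; the 1989 baby-name data is not shown in this post, so the comparison plot is left out.

```r
# Hypothetical names standing in for users$name
toy_names <- c("Alice", "alan", "Bob", "@carol", "42dave", "\u00fcnal", "")

# Lower-case first, then take the first character of each name
first_letter <- substr(tolower(toy_names), 1, 1)

# Keep only a-z, using the built-in `letters` vector
first_letter <- first_letter[first_letter %in% letters]

# Count per letter; fixing the factor levels keeps zero-count letters on the axis
letter_counts <- table(factor(first_letter, levels = letters))

barplot(letter_counts,
        xlab = "First letter", ylab = "Number of names",
        main = "First letters of names (toy data)")
```

Keeping all 26 levels means a second barplot built the same way from the baby names would line up letter for letter.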
3. Let’s look at how the number of tweets you have and the number of followers you have relate. Plot the log10() of followers_count vs. log10() of statuses_count with more appropriate x and y axis labels and a nice title. Color the points based on whether the account is protected or not. Do you see any relationship between the followers count and statuses count? What can we learn about the nature of protected accounts from this graph?
users <- read.csv("users.csv", header=TRUE, stringsAsFactors=FALSE)
users.protected <- users[users$protected=="true",]
users.notprotected <- users[users$protected=="false",]
# Dark background with muted foreground, label, and title colors
par(bg = "gray15")
par(fg = "gray30")
par(col.lab = "gray30")
par(col.main = "gray50")
plot_colors <- c("#FF000005","#0000FF05")
legend_colors <- c("#FF000090","#0000FF90")
plot(log10(users.notprotected$followers_count), log10(users.notprotected$statuses_count), pch=16, bty="n", xlab="Number of Followers (log10)", ylab="Number of Tweets (log10)", main="Twitter Users", col=plot_colors[2])
points(log10(users.protected$followers_count), log10(users.protected$statuses_count), pch=16, col=plot_colors[1])
legend("topleft", c("Protected","Not Protected"), cex=0.6, bty="n", pch=16, col=legend_colors)
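One thing to keep in mind with this plot: accounts with zero followers or zero tweets give log10(0) = -Inf, which cannot be placed on the axes. A small sketch, on made-up counts (toy_users is hypothetical), of how one might count those accounts and put a number on the relationship among the rest:

```r
# Hypothetical counts; two of the six users have a zero in one column
toy_users <- data.frame(
  followers_count = c(0, 10, 100, 1000, 10000, 50),
  statuses_count  = c(5, 0, 200, 3000, 20000, 40)
)

log_followers <- log10(toy_users$followers_count)  # log10(0) is -Inf
log_statuses  <- log10(toy_users$statuses_count)

# Rows where both logged values are finite
plottable <- is.finite(log_followers) & is.finite(log_statuses)
n_dropped <- sum(!plottable)

# Pearson correlation of the logged counts, finite rows only
r <- cor(log_followers[plottable], log_statuses[plottable])
```

A correlation on the real data would be one way to back up what the scatterplot suggests, computed separately for protected and unprotected accounts if desired.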
Interesting resources I found while doing this: