Mining the Social Web
Synopsis
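Facebook, Twitter, and LinkedIn generate a tremendous amount of valuable social data, but how can you find out who is connecting through social media, what they are talking about, or where they are located? Mining the Social Web (reprint edition) is a concise, hands-on book that shows you how to answer these questions and more. You will learn how to combine social web data, analysis techniques, and visualization to find what you have been looking for in the social world, along with useful information you did not know existed. Each standalone chapter introduces techniques for mining data in a different area of the social web, including blogs and email. All you need is some programming experience and a willingness to learn basic Python tools.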
About the Author
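Matthew A. Russell, Vice President of Engineering at Digital Reasoning Systems and Principal at Zaffra, is a computer scientist who is passionate about data mining, open source, and web application technologies. He is the author of Dojo: The Definitive Guide (O'Reilly).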
Table of Contents
Preface
1. Introduction: Hacking on Twitter Data
    Installing Python Development Tools
    Collecting and Manipulating Twitter Data
    Tinkering with Twitter's API
    Frequency Analysis and Lexical Diversity
    Visualizing Tweet Graphs
    Synthesis: Visualizing Retweets with Protovis
    Closing Remarks
2. Microformats: Semantic Markup and Common Sense Collide
    XFN and Friends
    Exploring Social Connections with XFN
    A Breadth-First Crawl of XFN Data
    Geocoordinates: A Common Thread for Just About Anything
    Wikipedia Articles + Google Maps = Road Trip?
    Slicing and Dicing Recipes (for the Health of It)
    Collecting Restaurant Reviews
    Summary
3. Mailboxes: Oldies but Goodies
    mbox: The Quick and Dirty on Unix Mailboxes
    mbox + CouchDB = Relaxed Email Analysis
    Bulk Loading Documents into CouchDB
    Sensible Sorting
    Map/Reduce-Inspired Frequency Analysis
    Sorting Documents by Value
    couchdb-lucene: Full-Text Indexing and More
    Threading Together Conversations
    Look Who's Talking
    Visualizing Mail "Events" with SIMILE Timeline
    Analyzing Your Own Mail Data
    The Graph Your (Gmail) Inbox Chrome Extension
    Closing Remarks
4. Twitter: Friends, Followers, and Setwise Operations
    RESTful and OAuth-Cladded APIs
    No, You Can't Have My Password
    A Lean, Mean Data-Collecting Machine
    A Very Brief Refactor Interlude
    Redis: A Data Structures Server
    Elementary Set Operations
    Souping Up the Machine with Basic Friend/Follower Metrics
    Calculating Similarity by Computing Common Friends and Followers
    Measuring Influence
    Constructing Friendship Graphs
    Clique Detection and Analysis
    The Infochimps "Strong Links" API
    Interactive 3D Graph Visualization
    Summary
5. Twitter: The Tweet, the Whole Tweet, and Nothing but the Tweet
    Pen : Sword :: Tweet : Machine Gun (?!?)
    Analyzing Tweets (One Entity at a Time)
    Tapping (Tim's) Tweets
    Who Does Tim Retweet Most Often?
    What's Tim's Influence?
    How Many of Tim's Tweets Contain Hashtags?
    Juxtaposing Latent Social Networks (or #JustinBieber Versus #TeaParty)
    What Entities Co-Occur Most Often with #JustinBieber and #TeaParty Tweets?
    On Average, Do #JustinBieber or #TeaParty Tweets Have More Hashtags?
    Which Gets Retweeted More Often: #JustinBieber or #TeaParty?
    How Much Overlap Exists Between the Entities of #TeaParty and #JustinBieber Tweets?
    Visualizing Tons of Tweets
    Visualizing Tweets with Tricked-Out Tag Clouds
    Visualizing Community Structures in Twitter Search Results
    Closing Remarks
6. LinkedIn: Clustering Your Professional Network for Fun (and Profit?)
    Motivation for Clustering
    Clustering Contacts by Job Title
    Standardizing and Counting Job Titles
    Common Similarity Metrics for Clustering
    A Greedy Approach to Clustering
    Hierarchical and k-Means Clustering
    Fetching Extended Profile Information
    Geographically Clustering Your Network
    Mapping Your Professional Network with Google Earth
    Mapping Your Professional Network with Dorling Cartograms
    Closing Remarks
7. Google Buzz: TF-IDF, Cosine Similarity, and Collocations
    Buzz = Twitter + Blogs (???)
    Data Hacking with NLTK
    Text Mining Fundamentals
    A Whiz-Bang Introduction to TF-IDF
    Querying Buzz Data with TF-IDF
    Finding Similar Documents
    The Theory Behind Vector Space Models and Cosine Similarity
    Clustering Posts with Cosine Similarity
    Visualizing Similarity with Graph Visualizations
    Buzzing on Bigrams
    How the Collocation Sausage Is Made: Contingency Tables and Scoring Functions
    Tapping into Your Gmail
    Accessing Gmail with OAuth
    Fetching and Parsing Email Messages
    Before You Go Off and Try to Build a Search Engine...
    Closing Remarks
8. Blogs et al.: Natural Language Processing (and Beyond)
    NLP: A Pareto-Like Introduction
    Syntax and Semantics
    A Brief Thought Exercise
    A Typical NLP Pipeline with NLTK
    Sentence Detection in Blogs with NLTK
    Summarizing Documents
    Analysis of Luhn's Summarization Algorithm
    Entity-Centric Analysis: A Deeper Understanding of the Data
    Quality of Analytics
    Closing Remarks
9. Facebook: The All-in-One Wonder
    Tapping into Your Social Network Data
    From Zero to Access Token in Under 10 Minutes
    Facebook's Query APIs
    Visualizing Facebook Data
    Visualizing Your Entire Social Network
    Visualizing Mutual Friendships Within Groups
    Where Have My Friends All Gone? (A Data-Driven Game)
    Visualizing Wall Data As a (Rotating) Tag Cloud
    Closing Remarks
10. The Semantic Web: A Cocktail Discussion
    An Evolutionary Revolution?
    Man Cannot Live on Facts Alone
    Open-World Versus Closed-World Assumptions
    Inferencing About an Open World with FuXi
    Hope
Index