OpenRefine: Reconcile, Extend, & Publish
Training on OpenRefine continues with instruction from expert Tom Morris. Over two weeks we have learned how to import different kinds of file formats, and explored the tools and techniques for...
Web Scraping 101
This week we continued on our Data Scientist journey by exploring web scraping basics. Our intrepid leader was again Tom Morris, who demonstrated two ways, out of many, to scrape web pages. While there...
Opening Government with Python
On October 4th, we were lucky to be joined by James Turk from the Sunlight Foundation, “a nonprofit, nonpartisan organization that uses the power of the Internet to catalyze greater government openness...
Using Web APIs from Python
On October 1, we had our fifth and final session with our able instructor, Tom Morris, for this section of the DST4L course. Tom provided us with an overview of APIs that was succinct but detailed. We...
Exploratory Data Analysis and Statistics using Pandas and Matplotlib
Today we looked at exploratory data analysis using pandas and matplotlib. After hanging out with Tom Morris for the past five weeks, we switched it up and had a lecture from Rahul Dave. Rahul is a...
Pandas: Munging, Stats and Visualization
In this past week’s class on Tuesday, October 15th, we went back to our dataset of olive oils and their constituent fatty acids to learn some more about Pandas, NumPy, and Matplotlib. As an aside, this...
Stats1: Basics with NumPy, Matplotlib, scikit-learn
Our #DST4L class on October 22nd was a whirlwind tour of probability, statistics, and machine learning in 3 hours. Before diving into that, I want to make sure you all know that the Python programming...
Naive Bayes and Friends
In our 9th week of #DST4L, Rahul Dave built on the previous week’s crash course in statistics with a deep dive into machine learning, classifiers, and information retrieval. First, this saw us working...
Data Vis 101 with Lynn Cherny
The DST4L class of November 4 featured the return of instructor Lynn Cherny, whom we last saw at the end of August when she taught a class on data manipulation and graphing in Excel. Lynn started her...
Gabriel Florit, Boston Globe Visualizations
On November 18th, Gabriel Florit from the Boston Globe talked about data analysis and exploration and presented a wide variety of his interactive visualizations to an attentive audience at the CfA...
Topic Modeling and Gephi
“(11/17/2013) Guided once more by the wise and powerful Dr. Lynn Cherny, MD (Master of Data), we first attempted to discourage cardiac fibrillation by applying Topic Modeling to Grimm’s Fairy Tales,...
DST4L Feedback Session
The last regularly scheduled meeting of the DST4L class on November 26th, 2013 was devoted to course feedback (the originally scheduled topic “Putting it all Together: On the Web, PDF, Reporting” was...
Registration now open for Data Scientist Training for Librarians (DST4L)
Data Scientist Training for Librarians or DST4L is an experimental course being offered by the Harvard-Smithsonian Center for Astrophysics John G. Wolbach Library and the Harvard Library to train...
Data Savvy Librarians Meetup
Mark your calendars: the first Data Savvy Librarians Meetup is coming up soon! Wednesday, August 27, 4 pm-6 pm, Phillips Auditorium, Harvard-Smithsonian Center for Astrophysics, 60 Garden Street,...
OpenRefine and Metadata Freedom
“(9/22/2014) The spectacles were perpetrated by wizards of a sort. Magicians of Dada, or ta-da or someodd. As I first understood it, their power was mystical or perhaps equinoctial. All I knew was that...
Why Git? Managing Version Control
A roomful of librarians interspersed with a few technology geeks gathered at the Harvard-Smithsonian Center for Astrophysics Phillips Auditorium last weekend to hear software developer and GitHub...
Data Digging and Display with Python
“(11/8/2014) The Craftsman worked with speed and precision. He was architect, researcher, tester, and creator. From the translucent python coiled around his arm, he wrought the tools to build his...
Design for Interactive Data Visualization
On December 1st we were lucky to have Lynn Cherny join us as a guest speaker. Lynn’s talk, Design for Interactive Data Visualization (slides & video), started with Shneiderman’s Infovis mantra:...
Data Visualization Made Easy
Interactive Visualizations: Finding and Telling Stories with Data using Tableau. On February 5th, Tara Walker taught us how to use Tableau Public (a free service offered by Tableau, also known for its...
Assumptions, Visualizations, and the Quantified Self
On Friday, February 6, at 10 AM, the final session of DST4L’s third round commenced. Jim Davenport had more than enough content to cover during his marathon session. After a brief introduction to his...