A Unified API Wrapper to Simplify Web Data Collection

Short talk

Data scientists often need to collect data from different websites (e.g., Yelp, Twitter, Spotify) through their Web APIs. In this talk, we will present DataPrep.connector, a unified API wrapper in Python. It lets data scientists retrieve data from different websites through the same programming interface, which significantly simplifies web data collection. This talk will be of interest to all data scientists.
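To make the idea of a unified wrapper concrete, here is a minimal sketch of the general pattern. It is not DataPrep.connector's actual implementation or API: the class names, site names, URLs, and response keys below are all hypothetical placeholders, and the HTTP call is injected as a function so the example runs without network access.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List
from urllib.parse import urlencode


@dataclass
class EndpointConfig:
    """Hypothetical per-site description of one endpoint."""
    url: str          # endpoint URL (placeholder, not a real API)
    records_key: str  # key in the JSON response that holds the records


# Illustrative configurations only; each site names its endpoint and
# its response layout differently, which is what the wrapper hides.
CONFIGS: Dict[tuple, EndpointConfig] = {
    ("site_a", "businesses"): EndpointConfig(
        "https://api.site-a.example/v1/search", "businesses"
    ),
    ("site_b", "tracks"): EndpointConfig(
        "https://api.site-b.example/v2/tracks", "items"
    ),
}

# A fetcher takes a URL and returns the decoded JSON response.
Fetcher = Callable[[str], Dict[str, Any]]


class Connector:
    """Exposes the same query() call regardless of the target site."""

    def __init__(self, site: str, fetch: Fetcher) -> None:
        self.site = site
        self._fetch = fetch

    def query(self, table: str, **params: Any) -> List[Dict[str, Any]]:
        cfg = CONFIGS[(self.site, table)]
        # Append query parameters only when the caller supplied some.
        url = f"{cfg.url}?{urlencode(params)}" if params else cfg.url
        payload = self._fetch(url)
        # Pull the record list out of the site-specific response shape.
        return payload[cfg.records_key]
```

With a real fetcher (for example, one built on an HTTP client library), the calling code would look the same for every site: construct a `Connector` for the site, then call `query` with a table name and search parameters.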


Pei Wang

I am a Ph.D. student in the Database System Lab (DSL), School of Computing Science, Simon Fraser University, advised by Prof. Jiannan Wang. My research interests lie in data enrichment, data extraction, and data integration. I received my master's degree from Nagoya University, where I was a member of the DBlab and was advised by Prof. Chuan Xiao and Prof. Ishikawa Yoshiharu. I received my bachelor's degree from Xi’an Jiaotong University. I have interned with the Microsoft Research DMX group and the Amazon Product Graph team, and I have published papers at top database conferences such as SIGMOD.

Weiyuan Wu

Weiyuan is a Ph.D. student in the SFU Data Science Research Group (http://data.cs.sfu.ca/). His research interests include data management, ML debugging, and applying machine learning to real problems. He currently leads the DataPrep project (https://github.com/sfu-db/dataprep), a collection of Python libraries for data preparation.