Data in the Wild – Data on the Internet


Data Sources for class discussion 9/3/2013 and 9/10/2013:

Data in text files: Working with .CSV, .XLS & .XLSX, .TXT and related data files:

Many data sources provide data in a text format, either in fixed-width columns (typically .TXT) or with a delimiter such as a comma (typically .CSV). In addition, many sites provide data in .XLS / .XLSX format for Excel or other spreadsheets.
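
To make the difference between the two text layouts concrete, here is a minimal sketch that reads the same records from a comma-delimited string and from a fixed-width string using only the Python standard library. The sample data, the field names, the column positions, and the helper names `read_delimited` and `read_fixed_width` are all assumptions made up for illustration; a real source should document its own delimiter and column layout.

```python
import csv
import io

# Hypothetical sample data, invented for illustration only.
CSV_TEXT = "city,population\nSpringfield,30000\nShelbyville,25000\n"
FIXED_TEXT = (
    "Springfield   30000\n"
    "Shelbyville   25000\n"
)


def read_delimited(text, delimiter=","):
    """Parse delimited text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def read_fixed_width(text, fields):
    """Parse fixed-width text.

    `fields` is a list of (name, start, end) character positions,
    which must come from the data source's documentation.
    """
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        rows.append({name: line[start:end].strip() for name, start, end in fields})
    return rows


csv_rows = read_delimited(CSV_TEXT)
# Assumed layout: city in columns 0-13, population in columns 14-18.
fixed_rows = read_fixed_width(FIXED_TEXT, [("city", 0, 14), ("population", 14, 19)])
```

Both calls return the same list of records, and every value is still a string at this point. Converting values to numbers or dates is a separate step, and it is where many of the format problems show up.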

As we have discussed, these data often need close scrutiny. Format issues and exceptional data values can corrupt your results if you do not properly test and “scrub” your data sets.
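
As one possible illustration of "scrubbing", the sketch below separates usable rows from rows that need attention instead of silently dropping or converting them. The sample data, the column names, the `scrub` helper, and the set of missing-value markers are hypothetical; each real data source uses its own conventions, which should be checked in its documentation.

```python
import csv
import io

# Hypothetical raw data with typical problems: padding, an empty value,
# placeholder codes, a thousands separator, and a non-numeric entry.
RAW = (
    "store,sales\n"
    "S1, 71.5\n"
    "S2,\n"
    "S3,N/A\n"
    "S4,-999\n"
    'S5,"1,002.0"\n'
    "S6,abc\n"
)

# Assumed markers for "no data"; these vary from source to source.
MISSING_MARKERS = {"", "N/A", "NA", "-999"}


def scrub(text):
    """Return (clean_rows, rejected_rows) so problems are visible, not hidden."""
    clean, rejected = [], []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(csv.DictReader(io.StringIO(text)), start=2):
        raw = (row["sales"] or "").strip()
        if raw in MISSING_MARKERS:
            rejected.append((line_no, row["store"], "missing value"))
            continue
        try:
            value = float(raw.replace(",", ""))  # drop thousands separators
        except ValueError:
            rejected.append((line_no, row["store"], "not a number"))
            continue
        clean.append({"store": row["store"].strip(), "sales": value})
    return clean, rejected


clean, rejected = scrub(RAW)
for line_no, store, reason in rejected:
    print(f"line {line_no}: store {store}: {reason}")
```

Keeping the rejected rows with their line numbers makes it possible to go back to the source file and decide case by case whether a value should be corrected, excluded, or treated as legitimately missing.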

See also – Class 4 – Discussion on Data APIs