Merge pull request #34 from DHRI-Curriculum/dyoong-suggested-terms

Adding terms for data literacies
DHRI-Curriculum · Aug 4, 2020 · 5b3386a
2 parents 616ecb1 + 98b3f3c

Showing 12 changed files with 137 additions and 0 deletions.
diff --git a/terms/csv.md b/terms/csv.md
@@ -0,0 +1,11 @@
+# CSV (file format)
+
+CSV, or Comma-Separated Values, uses (you guessed it!) commas to separate values. Each line is a new "record," and each value within a line (separated by a comma) is a new "field." The first line often holds the field names, as `First Name,Last Name` does in the example below. This format stores tabular data in a clean way that facilitates transfer between different data architectures. As data formats go, it is very rudimentary (even predating personal computers!) and is easy to type, requiring no special characters beyond a comma.
+
+```
+First Name,Last Name
+Smally,McTiny
+Kitty,Kitty
+Foots,Smith
+Tiger,Jaws
+```
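
To see the record/field structure in practice, here is a minimal sketch that parses the sample above with Python's standard `csv` module. The variable names (`CSV_TEXT`, `records`) are illustrative assumptions, not part of the original entry.

```python
import csv
import io

# The sample data from the CSV entry, held in a string for convenience.
CSV_TEXT = """First Name,Last Name
Smally,McTiny
Kitty,Kitty
Foots,Smith
Tiger,Jaws
"""

# DictReader treats the first line as field names and yields one dict per record.
reader = csv.DictReader(io.StringIO(CSV_TEXT))
records = list(reader)

for record in records:
    print(record["First Name"], record["Last Name"])
```

Reading from a file instead of a string would only change the object handed to `csv.DictReader`.
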
diff --git a/terms/data.md b/terms/data.md
@@ -0,0 +1,9 @@
+# Data
+
+There are many different perspectives on what counts as data. Some cite data as "material or information" on which "an argument, theory, test or hypothesis, or another research output is based" ([Queensland University of Technology](http://www.mopp.qut.edu.au/D/D_02_08.jsp)), while others critique the understanding of data as "mere descriptions of a priori conditions" ([Johanna Drucker](http://www.digitalhumanities.org/dhq/vol/5/1/000091/000091.html)). Data, in our case, are subjective (because of our interests and assumptions) and are the materials and/or information necessary to reach our conclusions.
+
+## Readings
+
+- Johanna Drucker's [Humanities Approaches to Graphical Display](http://www.digitalhumanities.org/dhq/vol/5/1/000091/000091.html)
+- Matthew Salganik's [Readymade v. Custommade Data](https://www.bitbybitbook.com/en/1st-ed/introduction/themes/)
+- Catherine D'Ignazio and Lauren Klein's [The Numbers Don't Speak for Themselves](https://data-feminism.mitpress.mit.edu/)
diff --git a/terms/desc-analysis.md b/terms/desc-analysis.md
@@ -0,0 +1,9 @@
+# Descriptive Analysis
+
+Descriptive analysis comprises techniques geared towards summarizing a data set, such as:
+
+- Mean
+- Median
+- Mode
+- Average (in everyday usage, usually another name for the mean)
+- Standard deviation
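
As a rough illustration of the summaries listed above, the following sketch uses Python's standard `statistics` module. The `ages` sample is hypothetical and exists only to give the functions something to summarize.

```python
import statistics

# Hypothetical sample: ages of seven cats, in years (made up for illustration).
ages = [2, 3, 3, 5, 7, 8, 14]

mean = statistics.mean(ages)      # arithmetic average
median = statistics.median(ages)  # middle value of the sorted data
mode = statistics.mode(ages)      # most frequent value
stdev = statistics.stdev(ages)    # sample standard deviation

print(mean, median, mode, round(stdev, 2))
```

Note that `statistics.stdev` computes the sample standard deviation; `statistics.pstdev` would be the choice if the data were the whole population.
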
diff --git a/terms/high-quality-data.md b/terms/high-quality-data.md
@@ -0,0 +1,5 @@
+# High Quality Data
+
+High quality data is often understood as valid, accurate, complete, consistent, and uniform. These qualities are often achieved through the cleaning process.
+
+Measurements are valid when they conform to set constraints. They are accurate when they represent the correct values (often requiring cross-referencing trusted external sources). They are complete when they represent everything that might be known and are consistent when observations do not contradict each other. Measurements are uniform when the same unit of measure is used in all relevant measurements.  
diff --git a/terms/inferential-analysis.md b/terms/inferential-analysis.md
@@ -0,0 +1,6 @@
+# Inferential Analysis
+
+Inferential analysis comprises techniques geared towards testing a hypothesis about a population based on your data set, such as:
+
+- Extrapolation
+- P-value calculation
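
One of many ways to calculate a p-value is an exact permutation test, sketched below with only the standard library. This is an illustrative choice rather than a method the entry prescribes, and the two groups of measurements are hypothetical.

```python
from itertools import combinations
from statistics import mean

# Hypothetical measurements for two groups (made up for illustration).
group_a = [12, 15, 14, 16]
group_b = [10, 11, 9, 13]

# Observed difference in group means.
observed = mean(group_a) - mean(group_b)

# Under the null hypothesis the group labels are interchangeable, so we
# enumerate every way of assigning four of the eight values to group A.
pooled = group_a + group_b
total = sum(pooled)
extreme = 0
count = 0
for chosen in combinations(range(len(pooled)), len(group_a)):
    sum_a = sum(pooled[i] for i in chosen)
    diff = sum_a / len(group_a) - (total - sum_a) / len(group_b)
    count += 1
    # Two-sided test: count splits at least as extreme as the observed one.
    if abs(diff) >= abs(observed):
        extreme += 1

p_value = extreme / count
print(f"p = {p_value:.4f}")
```

Full enumeration is only practical for small samples; larger data sets would typically use random resampling or a parametric test instead.
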
diff --git a/terms/json.md b/terms/json.md
@@ -0,0 +1,26 @@
+# JSON (file format)
+
+JSON, or JavaScript Object Notation, uses a nested structure built from "key/value" pairs, like the `firstName` key, which is tied to the `Smally` value (at least for the first cat!). JSON is popular with web applications that save and send data between your browser and web servers, because its syntax is based on JavaScript, the main language of web browsers.
+
+```json
+{
+    "Cats": [ 
+        {
+            "firstName": "Smally",
+            "lastName": "McTiny"
+        }, 
+        {
+            "firstName": "Kitty",
+            "lastName": "Kitty"
+        },
+        {
+            "firstName": "Foots",
+            "lastName":"Smith"
+        }, 
+        {
+            "firstName": "Tiger",
+            "lastName":"Jaws"
+        } 
+    ]
+} 
+```
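
To show how the key/value pairs are reached once the text is parsed, here is a minimal sketch using Python's standard `json` module on the same cat data. The variable names are illustrative assumptions.

```python
import json

# The sample data from the JSON entry, condensed onto fewer lines.
JSON_TEXT = """
{
    "Cats": [
        {"firstName": "Smally", "lastName": "McTiny"},
        {"firstName": "Kitty", "lastName": "Kitty"},
        {"firstName": "Foots", "lastName": "Smith"},
        {"firstName": "Tiger", "lastName": "Jaws"}
    ]
}
"""

# json.loads turns JSON objects into dicts and JSON arrays into lists.
data = json.loads(JSON_TEXT)

# Look up a value by its key on the first cat.
first_cat = data["Cats"][0]
print(first_cat["firstName"])

# Walk the nested list to build each cat's full name.
full_names = [f'{cat["firstName"]} {cat["lastName"]}' for cat in data["Cats"]]
print(full_names)
```
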
diff --git a/terms/open-data-formats.md b/terms/open-data-formats.md
@@ -0,0 +1,8 @@
+# Open Data Formats
+
+Open data formats are file formats that are available to anyone, free of charge, which supports accessibility, future-proofing, and preservation. These file formats also allow for easy reuse and aid research reproducibility and accountability. They are not limited by intellectual property rights or copyrights, which distinguishes them from proprietary formats. Some examples of open data formats are .csv, .pdf, and .json.
+
+## Readings
+
+- Library of Congress [Recommended Formats Statement](https://www.loc.gov/preservation/resources/rfs/)
+- Stanford University's [best practices for file formats](https://library.stanford.edu/research/data-management-services/data-best-practices/best-practices-file-formats)
diff --git a/terms/proprietary-data-fornats.md b/terms/proprietary-data-fornats.md
@@ -0,0 +1,8 @@
+# Proprietary Data Formats
+
+Proprietary data file formats are file formats that rely on dedicated, licensed software and/or systems. These file formats are often copyrighted, patented, or otherwise restricted, and often require a fee or paid-for software to open. They are usually discouraged in research projects, especially those intended to be shared with a wider public. This is distinct from open data formats. Some examples include .xlsx, .doc, and .3ds.
+
+## Readings
+
+- Library of Congress [Recommended Formats Statement](https://www.loc.gov/preservation/resources/rfs/)
+- Stanford University's [best practices for file formats](https://library.stanford.edu/research/data-management-services/data-best-practices/best-practices-file-formats)
diff --git a/terms/qual-analysis.md b/terms/qual-analysis.md
@@ -0,0 +1,11 @@
+# Qualitative Analysis
+
+Qualitative analysis comprises techniques geared towards understanding a phenomenon, rather than predicting outcomes or testing hypotheses, such as:
+
+- Grounded Theory/Computational Grounded Theory
+- Content Analysis
+- Text Analysis
+
+## Readings
+
+- [Computational Grounded Theory: A Methodological Framework](https://drive.google.com/file/d/0BxI6W5IIG74FeEtGbjQ0WF9uM0U/view)
diff --git a/terms/raw-data.md b/terms/raw-data.md
@@ -0,0 +1,9 @@
+# "Raw" Data
+
+"Raw" data has yet to be processed, meaning it has not been manipulated by a human or computer. Received or collected data could be in any number of formats, locations, etc. It could be in any of the forms listed in the previous section.
+
+But "raw" data is a relative term: when one person finishes processing data and presents it as a finished product, another person may take that product and work on it further, and for them that data is "raw" data.
+
+## Readings
+
+- Johanna Drucker's [Humanities Approaches to Graphical Display](http://www.digitalhumanities.org/dhq/vol/5/1/000091/000091.html)
diff --git a/terms/tidy-data.md b/terms/tidy-data.md
@@ -0,0 +1,11 @@
+# Tidy Data
+
+Tidy data is data that has been processed and organized into a structure that follows these rules:
+
+1. Each variable has its own column.
+2. Each observation has its own row.
+3. Each value has its own cell.
+
+## Readings
+
+- Hadley Wickham's [Tidy Data](https://www.jstatsoft.org/article/view/v059i10)
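
To make the three tidy-data rules concrete, the sketch below reshapes a small "wide" table into a tidy one using plain Python. The cat weights, years, and column names are hypothetical values invented for this illustration.

```python
# Hypothetical "wide" table: one row per cat, with one column per year.
# The year columns hold weights in kilograms (made-up values).
wide = [
    {"name": "Smally", "2019": 3.1, "2020": 3.4},
    {"name": "Kitty", "2019": 4.0, "2020": 4.2},
]

# Tidy version: the variables are name, year, and weight_kg (one column each),
# and each row is a single observation of one cat in one year.
tidy = [
    {"name": row["name"], "year": int(year), "weight_kg": weight}
    for row in wide
    for year, weight in row.items()
    if year != "name"
]

for observation in tidy:
    print(observation)
```

In the wide table the year is hidden in the column headers; in the tidy table it becomes a variable of its own, which is what the first rule asks for.
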
diff --git a/terms/xml.md b/terms/xml.md
@@ -0,0 +1,24 @@
+# XML (file format)
+
+XML, or eXtensible Markup Language, is a file format that uses a nested structure in which "tags" like `<Cat>` contain other tags, like `<firstName>`. This format is good for organizing a document in a tree-like structure, just like HTML, where we want to nest elements, such as a sentence within a paragraph. XML does not carry any information about how it should be displayed and can be used in a variety of presentation scenarios.
+
+```xml
+<Cats> 
+    <Cat> 
+        <firstName>Smally</firstName> 
+        <lastName>McTiny</lastName> 
+    </Cat> 
+    <Cat> 
+        <firstName>Kitty</firstName> 
+        <lastName>Kitty</lastName> 
+    </Cat> 
+    <Cat> 
+        <firstName>Foots</firstName> 
+        <lastName>Smith</lastName> 
+    </Cat> 
+    <Cat> 
+        <firstName>Tiger</firstName> 
+        <lastName>Jaws</lastName> 
+    </Cat> 
+</Cats> 
+```
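
As a small illustration of the tree structure described above, this sketch walks the same cat data with Python's standard `xml.etree.ElementTree` module. The variable names are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# The sample data from the XML entry.
XML_TEXT = """
<Cats>
    <Cat>
        <firstName>Smally</firstName>
        <lastName>McTiny</lastName>
    </Cat>
    <Cat>
        <firstName>Kitty</firstName>
        <lastName>Kitty</lastName>
    </Cat>
    <Cat>
        <firstName>Foots</firstName>
        <lastName>Smith</lastName>
    </Cat>
    <Cat>
        <firstName>Tiger</firstName>
        <lastName>Jaws</lastName>
    </Cat>
</Cats>
""".strip()

# fromstring returns the root element (<Cats>) of the parsed tree.
root = ET.fromstring(XML_TEXT)

# Each <Cat> is a child of the root; findtext reads the text of a nested tag.
names = [
    (cat.findtext("firstName"), cat.findtext("lastName"))
    for cat in root.findall("Cat")
]

for first, last in names:
    print(first, last)
```
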