diff --git a/Postgraduate/COMP6214 Open Data Innovation/2021-02-05-licensing-web-of-data-open-data-discovery.md b/Postgraduate/COMP6214 Open Data Innovation/2021-02-05-licensing-web-of-data-open-data-discovery.md
new file mode 100644
index 0000000..f9b4b92
--- /dev/null
+++ b/Postgraduate/COMP6214 Open Data Innovation/2021-02-05-licensing-web-of-data-open-data-discovery.md
@@ -0,0 +1,16 @@
+# Licensing, Web of Data, and Open Data Discovery
+
+## 10 Open Data Principles
+
+1. Completeness (including metadata)
+2. ...
+3. Timeliness: release asap
+4. Ease of physical and electronic access
+5. Machine readability
+6. Non-discrimination
+7. Commonly owned or open standards
+8. Licensing
+9. Permanence
+10. Usage costs
+
+## Open Data Licensing
diff --git a/Postgraduate/COMP6214 Open Data Innovation/2021-02-05-structures.md b/Postgraduate/COMP6214 Open Data Innovation/2021-02-05-structures.md
new file mode 100644
index 0000000..f9d1a2e
--- /dev/null
+++ b/Postgraduate/COMP6214 Open Data Innovation/2021-02-05-structures.md
@@ -0,0 +1,22 @@
+# Open, Closed, Hybrid, and Open Data
+
+> Open Data means **anyone** can **freely access, use, modify, and share** for **any purpose** (subject, at most, to requirements that preserve provenance and openness).
+
+- Closed data requires permissions in order to access, use, modify, and share.
+- Hybrid data is a limited presentation (open) to the proprietary data (closed). (?)
+
+**Linked Data** refers to a set of _best practices_ for publishing and interlinking structured data on the Web.
+
+- Open data is a campaign
+- Linked Data can be applied to open, closed, or hybrid data.
+- _Best practices:_
+  - URIs are used as names for things
+  - HTTP URIs are used so that people can look up the names
+  - Useful information is provided when a URI is looked up, using standards such as RDF and SPARQL
+  - Other URIs are included for discovering more things
+
+## Structures of Data
+
+- Tabular
+- Hierarchical
+- Network (Graph)
diff --git a/Postgraduate/COMP6214 Open Data Innovation/2021-02-08-data-cleaning.md b/Postgraduate/COMP6214 Open Data Innovation/2021-02-08-data-cleaning.md
new file mode 100644
index 0000000..01d4034
--- /dev/null
+++ b/Postgraduate/COMP6214 Open Data Innovation/2021-02-08-data-cleaning.md
@@ -0,0 +1,42 @@
+# Data Cleaning
+
+Data Cleaning is the process of starting with (semi-)raw data from one or more sources and turning it into data of reliable quality for your applications.
+
+Real-life data are often:
+
+- Incomplete
+- Inconsistent
+- Out-of-context
+- ...
+
+It's important to keep a note of the changes while cleaning the data.
+
+## Types of Error in Real-Life Data
+
+- Syntactic: violation of domain constraints
+- Semantic: discrepancies between the stored values and the true values in real life
+
+## Properties of Clean Data
+
+### Information Completeness
+
+- Closed World Assumption (CWA): assuming the database has all real-world entities, although some of their values may be missing
+- Open World Assumption (OWA): assuming the database may be missing entities too, not just values
+
+### Data Currency
+
+Timeliness; not out-of-date.
+
+## Data Validation
+
+- Consumers understand the data more easily
+- Programmers do less "defensive programming"
+- Producers can precisely define and validate the output
+
+## Tools
+
+- Linter: CSVLint, JSONLint
+  - for syntactic errors
+- _OpenRefine_
+- _Excel_, or other spreadsheets
+- _Bespoke Scripts_
diff --git a/Postgraduate/COMP6214 Open Data Innovation/2021-02-12-ontology-and-data-modelling.md b/Postgraduate/COMP6214 Open Data Innovation/2021-02-12-ontology-and-data-modelling.md
new file mode 100644
index 0000000..324386e
--- /dev/null
+++ b/Postgraduate/COMP6214 Open Data Innovation/2021-02-12-ontology-and-data-modelling.md
@@ -0,0 +1,9 @@
+# Ontology and Data Modelling
+
+An ontology is a model representing some subject matter or domain.
+
+## Creating Your Own Ontology
+
+- Start with your own domain
+- Generalise things
+- Find relationships
diff --git a/Postgraduate/ELEC6234 Embedded Processors/2021-02-10-mips.md b/Postgraduate/ELEC6234 Embedded Processors/2021-02-10-mips.md
new file mode 100644
index 0000000..35357d3
--- /dev/null
+++ b/Postgraduate/ELEC6234 Embedded Processors/2021-02-10-mips.md
@@ -0,0 +1 @@
+# The MIPS Architecture
diff --git a/Postgraduate/ELEC6242 Cryptography/2021-02-07-classic-ciphers.md b/Postgraduate/ELEC6242 Cryptography/2021-02-07-classic-ciphers.md
new file mode 100644
index 0000000..bfbc8a4
--- /dev/null
+++ b/Postgraduate/ELEC6242 Cryptography/2021-02-07-classic-ciphers.md
@@ -0,0 +1,31 @@
+# Classic Ciphers
+
+Terminology:
+
+- **Plaintext**: message in a "clear" form
+- **Steganography**: message whose **existence is concealed (hidden)**
+- **Cryptography**: message in plain view, but the **meaning is concealed (hidden)**.
+- **Cipher**: the operation on groups of **characters**
+
+## Compression & Encryption
+
+Compression and encryption are both about manipulating information (not data or wisdom).
+
+- _Compression_ extracts information from the data to encode it as efficiently as possible.
+- _Encryption_ diffuses a **key** into the information as much as possible, and encodes it.
+
+## The Best Approach
+
+The best practical way to transmit data securely is to:
+
+1. Compress the data
+   - this **removes redundancy** in the plaintext
+   - this also makes encryption (which is slow) **faster**, since there is less data to encrypt
+2. Encrypt
+3. Add error detection and recovery
+
+## Basic Cryptanalysis
+
+- Frequency analysis
+- "Crib": a known sequence of letters or words, e.g. _q is almost always followed by u_
+- Make guesses
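
The frequency-analysis bullet in the cryptography notes above can be sketched in a few lines of Python. This is only an illustration and not part of the patch: the notes do not name a specific cipher, so the sketch assumes a simple Caesar (shift) cipher and assumes that `e` is the most common letter in the English plaintext. The function names `letter_frequencies`, `guess_caesar_shift`, and `caesar_shift` are hypothetical.

```python
from collections import Counter
import string


def letter_frequencies(text: str) -> list:
    """Count lowercase-folded letters in the text, most common first."""
    letters = [ch for ch in text.lower() if ch in string.ascii_lowercase]
    return Counter(letters).most_common()


def caesar_shift(text: str, shift: int) -> str:
    """Shift each lowercase letter by `shift` places; leave other characters alone."""
    shifted = []
    for ch in text:
        if ch in string.ascii_lowercase:
            shifted.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        else:
            shifted.append(ch)
    return "".join(shifted)


def guess_caesar_shift(ciphertext: str) -> int:
    """Guess the shift by assuming the most common ciphertext letter stands for 'e'."""
    frequencies = letter_frequencies(ciphertext)
    if not frequencies:
        raise ValueError("ciphertext contains no letters to analyse")
    most_common_letter = frequencies[0][0]
    return (ord(most_common_letter) - ord("e")) % 26


# Assumed example message: 'e' is clearly its most common letter.
plaintext = "meet me near the green tree at seven"
ciphertext = caesar_shift(plaintext, 3)

guessed_shift = guess_caesar_shift(ciphertext)
recovered = caesar_shift(ciphertext, -guessed_shift)

print(guessed_shift)  # 3
print(recovered)      # meet me near the green tree at seven
```

On short or unusual messages the most common letter may not be `e`, which is where the other two bullets (cribs and guessing) come in.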