Data

Big Data

See big-data.md

Data Validation

Start by validating data formats for correctness.

Scripts for this can be found in both the DevOps-Python-tools and DevOps-Bash-tools repos.

Then proceed to more advanced content validation.
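The two stages above can be sketched roughly as follows, using JSON as the example format. This is only an illustration, not one of the scripts from the repos mentioned; the function names `is_valid_json` and `missing_fields` and the idea of checking for required keys are assumptions made for the example.

```python
import json


def is_valid_json(text: str) -> bool:
    """Format validation: return True if text parses as a JSON document."""
    try:
        json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return True


def missing_fields(text: str, required: set) -> set:
    """Content validation (hypothetical rule): return required keys that are absent.

    Assumes text has already passed format validation.
    """
    record = json.loads(text)
    if not isinstance(record, dict):
        # Not a JSON object, so none of the required keys can be present
        return set(required)
    return set(required) - set(record)
```

Running the format check first keeps the content check simple, since it can assume the input already parses.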

Data Integration

  • Apache Camel - open source with 100+ connectors
  • Spring Integration - XML config, only use for Spring-heavy shops
  • Mulesoft - XML config, only use if you need its proprietary connectors

Apache Camel

https://camel.apache.org/

Open source integration framework written in Java.

TODO Book: Camel in Action

Glue between technologies and protocols

  • 100+ connectors
  • transactions
  • error handling
  • scalability
  • monitoring
  • Java, Groovy, Scala DSLs
  • standard syntax for connector components
  • FuseIDE for Camel - drag and drop rather than coding
  • Enterprise-ready since 2007

Marshals / Unmarshals Java beans to / from different protocols / data formats.
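To illustrate the marshal / unmarshal idea only, here is a minimal sketch in plain Python. It is not Camel's API: Camel does this for Java beans through its own data format components. The `Order` class and the `marshal` / `unmarshal` helpers are hypothetical names chosen for the example, with JSON as the wire format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Order:  # hypothetical record type, standing in for a Java bean
    order_id: int
    item: str


def marshal(order: Order) -> str:
    """Convert the in-memory object to its JSON wire representation."""
    return json.dumps(asdict(order))


def unmarshal(payload: str) -> Order:
    """Rebuild the in-memory object from its JSON wire representation."""
    return Order(**json.loads(payload))
```

A round trip such as `unmarshal(marshal(Order(1, "widget")))` should give back an equal `Order`, which is the property an integration layer relies on when moving objects between systems.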

Data Formats and Compression

  • JSON
  • XML
  • Avro
  • CSV
  • YAML
  • Protobuf
  • gzip
  • zip

Technology Connectors & Protocols

https://camel.apache.org/components/latest/index.html

Highlights:

  • AWS services
  • ActiveMQ, ZeroMQ
  • Cassandra
  • CouchDB
  • Couchbase
  • Docker
  • Elasticsearch, Solr
  • File / directory watch ingest (Barclays used this)
  • FTP, DNS
  • GMail, Google Drive
  • Git / GitHub
  • HBase
  • HDFS
  • HTTP
  • Hazelcast
  • JDBC
  • JMS
  • Kafka
  • Kubernetes
  • LDAP
  • MongoDB
  • Redis
  • Splunk
  • Twitter
  • ZooKeeper

Mulesoft

  • lightweight enterprise service bus + integration framework
  • proprietary connectors
  • Anypoint Studio (Eclipse-based IDE)
  • Anypoint Enterprise Security - security features, transactions

Spring Integration

TODO

Ported from private Knowledge Base pages 2016+