- Smart collar service
- Appendix
- References
Design a scalable system which can track pet(s). These pet activities are considered:
- Physical activities (run, walk)
- Barking and its duration
- Location tracking while running or walking on map (lat, long)
Smart collar sends various actions items to IoT device via bluetooth/RF. Every collar is uniquely identified via its deviceid. This id is used in the dynamodb database table's Partition Key.
IoT (Internet of Things) device is responsible to publish events to system via HTTP POST. Multiple collar devices can be connected to a IoT device.
Sample POST Graphql is covered under Query and Mutation section.
Lambda function is exposed http endpoints via API Gateway. This provides highly available and scalable solution. Caching by API Gateway can be enable to provide the latency of requests to APIs. The default TTL for API Gateway caching is 5 minutes.
Business rules are implemented in the AWS Lambda. Serverless or lambda is managed service provided by cloud providers, AWS in our implementation.
Lambda is completely even driven i.e. it will only run when invoked. This is perfect for application services having quiet periods followed by peaks in traffic.
Lambda really works well with API Gateway. Here sole purpose of using lambda behind API Gateway is to expose lambda function through HTTP endpoint.
Lambda can instantly scale up to a large number of parallel executions, for which the limit is controlled by the no of concurrent requests. Downscaling is handled by terminating the lambda functions as soon as it completes the operation.
Serverless implementation is cheap because it meant to be run seldom and you are only billed for the time it executes.
There is only one disadvantage: lambda has cold start. Only first request get response a bit slower. However this can be avoided by regular polling or using warmup plugin.
Dynamodb is used to persist data provided by collar.
DynamoDB is fully managed NoSQL database provided by AWS.
DynamoDB supports a document oriented data model. It means in order to create a table, we just need to define the primary key. Items can be added into the table with dynamic set of attributes. DynamoDB supports the following data types:
- Scalar Data types: Number, String, Binary, Boolean
- Collection Data types: Set, Map, List
DynamoDB tables require users to reserver Read Capacity Unit (RCU) and Write Capacity Unit (WCU) upfront. Mostly read tables should have higher RCU values than less read tables.
DynamoDB supports auto-sharding and load-balancing. This allows applications to transparently store ever-growing amounts of data.
DynamoDB is well integrated into the AWS ecosystem.
- Data can easily and cost-effectively be backed up to S3
- Easy to export data to Elastic MapReduce (EMR) for analysis
- Security and access control is integrated into AWS IAM
There are two types of primary key: a simple primary key made up of just a partition key, and a composite primary key made up of a partition key and a sort key.
The composite primary key enables sophisticated query patterns, including grabbing all items with the given partition key or using the sort key to narrow the relevant items for a particular query.
Indexing in DynamoDB give developer and analyst the ability to quickly speed up queries. The primary key is indexed by defaulted. Primary index works the same in DynamoDB as in Relational databases.
DynamoDB supports two other types of indexing called secondary indexing. DynamoDB supports Local Secondary Indexes (LSI) and Global Secondary Index (GSI).
-
LSI - uses the same partition key as the underlying table but a different sort key (can be any column). This index is created while creating table.
-
GSI - can define an entirely different primary key for a table. This could mean setting an index with just a partition key for a table with a composite primary key. It could also mean using completely different attributes to populate a partition key and sort key.
Cloudwatch is used for logging purpose.
- partitionKey - deviceId for smart collar
- sortKey - timestamp (unix millis epoch) and activity type. Here timestamp of action *
- createdAt - timestamp ISO format (UTC) when data is inserted in the database *
- actityType - type of activity (Bark, Physical activity or location)
- actionData - details of each activity
Note - Please note activity timestamp maybe different from timestamp of when inserted due to various factors such as IoT's internet disconnection.
Figure: Sample data in the database
Schema - this will define what we can query for and what custom data types we have.
Query - Anything of type Query or Mutation is something we can ask for in our query to our API. The semantic meaning of Query is that you want to fetch data.
Mutation - The semantic meaning of Mutation is that you want to create/update or delete data. (Other than reading).
Schema, Query function and Mutation are defined in the typedef
.
Resolver - a collection of functions that is able to respond to a request and ends up delivering a response
- Node 8.10.0 or higher
- Serverless 1.40.0 or higher
- Typescript 3.5 or higher
npm i -g serverless
npm i -g typescript
Once prerequisites have been installed, clone this project and run following command to fetch dependencies.
git clone https://github.com/pujansrt/SmartCollar.git
cd SmartCollar
npm i
Before running npm run offline
please make sure to execute tsc atleast once.
tsc
This command will run the project in 8080 port. Port can be changed in the package.json file. AWS default dynamodb database is used.
npm run offline
Now visit http://localhost:8080/graphql the browser to launch GraphQL Playground
Note: If you have local dynaomodb running for development purpose, you can run following command. Please change database settings in env.yml file accordingly
sls run offline --port 8080
In order to create new record in the database, execute following graphql query.
mutation {
createActivityItem(
data: {
partitionKey: "54668-30073-6ad9de2"
activityType: "BARK"
actionData: { duration: 13}
}
) {
items{
partitionKey
sortKey
activityType
createdAt
}
}
}
mutation {
createActivityItem(
data: {
partitionKey: "54668-30073-6ad9de2"
activityType: "PHYSICAL_ACTIVITY"
actionData: { duration: 13}
}
) {
items{
...
}
}
}
mutation {
createActivityItem(
data: {
partitionKey: "54668-30073-6ad9de2"
activityType: "LOCATION"
actionData: { location: {lat: "10.11", long: "99.19"}}
}
) {
items{
...
}
}
}
{
getActivityItems(partitionKey: "54668-30073-6ad9de2") {
items {
partitionKey
sortKey
activityType
createdAt
actionData{
duration
location{
lat
long
}
}
}
next {
partitionKey
sortKey
}
}
}
{
getActivityItems(partitionKey: "54668-30073-6ad9de2", pageSize:25, activityType: "LOCATION") {
items {
partitionKey
sortKey
activityType
createdAt
actionData{
duration
location{
lat
long
}
}
}
next {
partitionKey
sortKey
}
}
}
{
getActivityItems(
partitionKey: "54668-30073-6ad9de2"
pageSize: 5
startTime: "1533061800000"
endTime: "1565002351704"
) {
items {
partitionKey
sortKey
activityType
createdAt
actionData {
duration
location {
lat
long
}
}
}
next {
partitionKey
sortKey
}
}
}
Default pagesize is 10. This pageSize can be passed via query argument as shown in the previous query section.
If HTTP Response has "next" attributes then it means there are more pages. Next pages' query should pass partitionKey as well as sortKey in the query argument.
These are activity types defined in serverless.yml
- BARK
- PHYSICAL_ACTIVITY
- LOCATION
You may add new activity type in serverless.yml file.
Following command will run tests and generate code coverage report
npm run test
maxAge for cache-control is set for 30 seconds. This can be changed in the src/v1/graphql/handler.ts file
cacheControl: {
defaultMaxAge: 30
}
Error message (without stacktrace) can be propagated to HTTP response to client. You can see src/v1/service/activity.service.ts
for more details.
if (...) throw new Error("Error message to propagate");
This configuration is added in the ApolloServer (src/v1/graphql/handler.ts)
formatError: (err: any) => {
if (err.message.startsWith("Database Error: ")) {// Don't give the specific errors to the client.
return new Error('Internal server error');
}
return {message: err.message};
},
src/v1/graphql/handler.ts
exports.graphql = server.createHandler({
cors: {
origin: "*",
credentials: true
}
});
- SQS Inclusion - Introduction of SQS so that IoT device can send request to SQS without waiting for save into database (fire and forget). This will improvement performance of database writes significantly. It will be service's job to get queued item from SQS and write to the database.
- One service for receiving IoT requests and another service for serving response purpose (client facing)
- SNS can be used for alarming purpose
Multiple AWS Region architecture is good for following reason:
- It improves latency for end-users - (latency based routing)
- Disaster recovery - If one region goes down, all users are diverted to different available region.
When lambda functions fail, they are retried. If the retries also fail, AWS has a feature to send information about the failed request to a SNS topic or SQS queue, called the Dead Letter Queue, which you can use to track and diagnose and react to lambda failures.
Following line is added in the serverless.yml file
onError: arn:aws:sqs:us-east-1:XXXXXX:test
When developing, local dynamodb can be used.
{
"AttributeDefinitions": [
{
"AttributeName": "partitionKey",
"AttributeType": "S"
},
{
"AttributeName": "sortKey",
"AttributeType": "S"
}
],
"KeySchema": [
{
"AttributeName": "partitionKey",
"KeyType": "HASH"
},
{
"AttributeName": "sortKey",
"KeyType": "RANGE"
}
],
"ProvisionedThroughput": {
"ReadCapacityUnits": 5,
"WriteCapacityUnits": 5
},
"TableName": "pujan"
}