Amazon Web Services - Modeling data in NoSQL DynamoDB
I'm trying to figure out how to model the following data in an AWS DynamoDB table.
I have a lot of IoT devices, and each one sends telemetry data every few seconds.
Attributes:
- device_id
- timestamp
- malware_name
- company_name
- action_performed (two possible values)
Queries:
- Show the incidents that happened in the last week.
- Show the incidents for a specific device_id.
- Show the incidents with the action "unable_to_remove".
- Show the incidents related to a specific malware.
- Show the incidents related to a specific company.
Thoughts:
I understand that I can add a GSI for each attribute, but I would rather use GSIs only if there is no other choice, since they cost me more money.
What should the main primary key (partition key : sort key) be?
Please share your thoughts. I care about them more than I care about a perfect answer, because I'm trying to learn how to think about and weigh the options rather than just get an answer to a specific question.
Thanks a lot!
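If you absolutely need all of the query patterns mentioned, you have no way out but to create a GSI for each of them. That comes with its own set of caveats:
- For query #1, you need a GSI with incident_date (or whatever you call it) as the partition key and device_id as the sort key. This might lead to hot partitions in DynamoDB, depending on your access patterns.
- There is a limit on the number of GSIs per table (5 when this was written; the default quota is now 20), and you'll use up several of them right away. What will you do if you need to support another kind of query in the future?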
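To make the first caveat concrete, here is a minimal sketch of how "last week" could be read from such a GSI. A DynamoDB Query needs an equality condition on the partition key, so a week becomes one request per day. The index name, the YYYY-MM-DD format of incident_date, and both helper functions are assumptions for illustration, not something from the question. The code only builds the request parameters and does not call AWS.

```python
from datetime import date, datetime, timedelta, timezone
from typing import Dict, List

# Hypothetical index name; the answer only says "a GSI on incident_date".
INDEX_NAME = "incident_date-device_id-index"


def incident_date_for(epoch_seconds: float) -> str:
    """Derive the GSI partition key (a UTC day, YYYY-MM-DD) from an epoch timestamp."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).strftime("%Y-%m-%d")


def last_week_queries(table_name: str, today: date, days: int = 7) -> List[Dict]:
    """Build one Query request per day, newest day first.

    Query only accepts an exact match on the partition key, so a date
    range has to be split into one request per incident_date value.
    """
    requests = []
    for offset in range(days):
        day = (today - timedelta(days=offset)).isoformat()
        requests.append({
            "TableName": table_name,
            "IndexName": INDEX_NAME,
            "KeyConditionExpression": "incident_date = :day",
            "ExpressionAttributeValues": {":day": {"S": day}},
        })
    return requests
```

Each dictionary has the shape the low-level boto3 client expects, so it could be passed as `client.query(**request)`. Note that all writes for "today" land on a single incident_date value, which is where the hot-partition concern comes from.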
When evaluating the pros and cons of using NoSQL for a given situation, one needs to consider both the read and the write access patterns. So the question you should ask is: why DynamoDB?
For example, do you really need real-time queries? If not, you can use DynamoDB as the main database and periodically sync the data (using AWS Lambda or Kinesis Firehose) to EMR or Redshift for later batch processing.
Edit: proposed primary key:
- device_id as the partition key and incident_date as the sort key, if you know that no two or more incidents for a given device_id can come at the exact same time.
- If the above doesn't work, incident_id as the partition key and incident_date as the sort key.
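As a rough sketch of the first option, the parameters below describe a table keyed on device_id and incident_date, plus a Query for one device over a time range. I am assuming incident_date is stored as an ISO 8601 string (so string order matches time order) and that the table is called "incidents"; the helper names are made up for illustration. The small in-memory `put_item` is only a toy model of the uniqueness caveat: two items with the same device_id and incident_date share a primary key, so the second write replaces the first.

```python
from typing import Dict, Tuple


def table_definition(table_name: str = "incidents") -> Dict:
    """Parameters for a create_table call with the proposed composite key."""
    return {
        "TableName": table_name,
        "AttributeDefinitions": [
            {"AttributeName": "device_id", "AttributeType": "S"},
            {"AttributeName": "incident_date", "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": "device_id", "KeyType": "HASH"},       # partition key
            {"AttributeName": "incident_date", "KeyType": "RANGE"},  # sort key
        ],
        "BillingMode": "PAY_PER_REQUEST",
    }


def device_incidents_query(table_name: str, device_id: str,
                           start_iso: str, end_iso: str) -> Dict:
    """Parameters for a Query: one device, incidents between two instants."""
    return {
        "TableName": table_name,
        "KeyConditionExpression":
            "device_id = :device AND incident_date BETWEEN :start AND :end",
        "ExpressionAttributeValues": {
            ":device": {"S": device_id},
            ":start": {"S": start_iso},
            ":end": {"S": end_iso},
        },
    }


def put_item(store: Dict[Tuple[str, str], Dict], item: Dict) -> None:
    """Toy stand-in for PutItem: same primary key means the old item is replaced."""
    store[(item["device_id"], item["incident_date"])] = item
```

This covers the "incidents for a specific device_id" query without any GSI. If two incidents from one device can share a timestamp, the toy `put_item` shows why the second option (a unique incident_id as the partition key) is the safer choice.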