IoT on Cloud: The Architecture Decisions That Determine Cost

IoT platform bills get ugly fast when early architecture decisions go unchallenged. Six decisions that keep IoT cloud costs sane.

John Lane 2023-02-01 5 min read
IoT is the cloud workload most likely to surprise a customer on their monthly bill. Ingestion charges, message routing fees, timeseries storage, rule engine invocations, and egress all compound in ways that look reasonable at pilot scale and catastrophic at production scale. After running IoT projects across manufacturing, building automation, and fleet customers, we have a short list of architecture decisions that determine whether an IoT deployment is sustainable.

1. Ingestion Cadence Is Your Single Biggest Cost Lever

The first question on any IoT project should be: how often do devices actually need to send data? The answer is almost never "every second." Vendors default to high-frequency telemetry because it makes dashboards look alive in a demo. In production, most business decisions can be made on data sampled every 30 seconds, every minute, or even every 15 minutes.

Cut your ingestion rate by 10x and your ingestion bill drops by roughly 10x, because ingestion is metered on message volume. Storage, processing, and egress drop with it. This is the single lever that matters most.
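To make the arithmetic concrete, here is a back-of-the-envelope sketch in Python. It assumes a flat per-message price, which is a simplification: real platforms use tiers, size increments, or unit-based quotas. The `price_per_million` value is a placeholder, not a quoted rate from any provider.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 30-day month, for round numbers


def monthly_messages(devices: int, interval_seconds: int) -> int:
    """Messages per month when every device sends one message per interval."""
    return devices * (SECONDS_PER_MONTH // interval_seconds)


def monthly_ingestion_cost(devices: int, interval_seconds: int,
                           price_per_million: float) -> float:
    """Ingestion cost under a flat per-message price (an assumption)."""
    messages = monthly_messages(devices, interval_seconds)
    return messages / 1_000_000 * price_per_million


if __name__ == "__main__":
    HYPOTHETICAL_PRICE = 1.00  # placeholder price per million messages
    for interval in (1, 10, 60, 900):
        msgs = monthly_messages(1_000, interval)
        cost = monthly_ingestion_cost(1_000, interval, HYPOTHETICAL_PRICE)
        print(f"every {interval:>3}s: {msgs:>13,} messages, ${cost:,.2f}")
```

Whatever the real price is, the ratio between rows is what matters: a thousand devices at one message per second produce about 2.6 billion messages a month, and the same fleet at one message per minute produces about 43 million.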

How to pick a cadence

  • Critical alarms and state changes: event-driven, not polled. Send only when something crosses a threshold.
  • Operational dashboards: 30 seconds to 5 minutes is plenty.
  • Analytics and trending: 1 to 15 minutes is usually fine, and you can aggregate on the edge first.
  • Regulatory data logging: the cadence your regulator actually requires — which is almost always slower than your instinct.
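The event-driven option in the first bullet can be sketched as a small state machine that stays silent until a reading crosses a threshold. The class name, the event strings, and the hysteresis band are our assumptions for illustration; the hysteresis is there only to stop a value hovering near the threshold from generating a stream of alarms.

```python
from typing import Optional


class ThresholdReporter:
    """Emit an event only when a reading crosses a threshold."""

    def __init__(self, threshold: float, hysteresis: float = 0.0) -> None:
        self.threshold = threshold
        self.hysteresis = hysteresis  # gap required below threshold to clear
        self.in_alarm = False

    def update(self, value: float) -> Optional[str]:
        """Return an event name on a state change, otherwise None."""
        if not self.in_alarm and value > self.threshold:
            self.in_alarm = True
            return "alarm_raised"
        if self.in_alarm and value < self.threshold - self.hysteresis:
            self.in_alarm = False
            return "alarm_cleared"
        return None  # nothing changed, so nothing is sent


if __name__ == "__main__":
    reporter = ThresholdReporter(threshold=80.0, hysteresis=2.0)
    for reading in (70, 79, 81, 85, 79, 77, 90):
        event = reporter.update(reading)
        if event is not None:
            print(reading, event)
```

Seven readings in this example produce three messages instead of seven, and the steady-state case produces none at all.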

Edge aggregation

If you need per-second data for local control loops, do the control at the edge and send summarized data to the cloud. A Raspberry Pi or an industrial gateway can hold a rolling buffer, compute min/max/avg, and publish the summary. This is how every experienced IoT architect ends up building systems regardless of what the cloud vendor demo showed.
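A minimal sketch of that gateway logic, assuming a simple fixed-size window that is flushed once it fills (a tumbling window rather than a true sliding one). The `EdgeAggregator` and `Summary` names are ours, and the actual publish step (MQTT or otherwise) is left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Summary:
    count: int
    minimum: float
    maximum: float
    average: float


class EdgeAggregator:
    """Buffer raw readings locally and emit one summary per full window."""

    def __init__(self, window_size: int) -> None:
        if window_size < 1:
            raise ValueError("window_size must be at least 1")
        self.window_size = window_size
        self._buffer: List[float] = []

    def add(self, reading: float) -> Optional[Summary]:
        """Store a reading; return a Summary when the window is full."""
        self._buffer.append(reading)
        if len(self._buffer) < self.window_size:
            return None
        summary = Summary(
            count=len(self._buffer),
            minimum=min(self._buffer),
            maximum=max(self._buffer),
            average=sum(self._buffer) / len(self._buffer),
        )
        self._buffer.clear()  # start the next window empty
        return summary
```

With `window_size=60` and one reading per second, the gateway keeps per-second data for local control and sends the cloud a single summary message each minute.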

2. Device Identity and Provisioning Will Define Operational Cost

The demo with ten devices hides every problem that shows up at ten thousand. Certificate management, fleet provisioning, rotating credentials, revoking compromised devices — these are the things that determine whether you have an operational nightmare two years in.

AWS IoT Core and Azure IoT Hub both have fleet provisioning and device identity features. (Google Cloud IoT Core has been discontinued; if you are on it, migrate to an alternative.) Use these features from the start even if you only have ten devices. Rolling your own certificate authority for IoT is a rite of passage and a bad idea. The cloud providers have done this work; let them.

For OTA firmware updates, budget the development work honestly. OTA is one of those features that looks simple and becomes a significant project when you account for rollback, staged rollouts, and the devices that are offline when the update drops.
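One small piece of that work, staged rollout selection, can be sketched as deterministic hashing. This is an illustrative approach of ours, not a feature of any particular OTA service: each device hashes to a stable bucket from 0 to 99, and a release is offered to devices whose bucket falls below the current rollout percentage. The function names and the ID and release strings are hypothetical.

```python
import hashlib


def rollout_bucket(device_id: str, release: str) -> int:
    """Map a device to a stable bucket in 0..99 for a given release."""
    digest = hashlib.sha256(f"{release}:{device_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def is_eligible(device_id: str, release: str, rollout_percent: int) -> bool:
    """True if the device falls inside the current rollout percentage."""
    return rollout_bucket(device_id, release) < rollout_percent


if __name__ == "__main__":
    fleet = [f"device-{n:04d}" for n in range(1000)]
    for percent in (1, 10, 50, 100):
        selected = sum(is_eligible(d, "fw-2.1.0", percent) for d in fleet)
        print(f"{percent:>3}% stage: {selected} devices")
```

Because the bucket is stable, a device that was eligible at 10% is still eligible at 50%, and a device that was offline during an earlier stage gets the same answer when it reconnects. Rollback and delivery tracking are separate problems and are not covered here.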

3. Message Routing Is Usually Where the Bill Gets Weird

The pattern that reliably goes over budget is "route every message to every downstream system." If you have a device sending temperature readings and you route them to a timeseries database, a data lake, a rule engine, an alerting service, and a dashboard feed, you are paying for five copies of that message.

The better pattern is one canonical ingestion path (device → MQTT broker → message bus) and then consumers that subscribe to what they actually need. A streaming system like Kinesis, Event Hubs, or Kafka sits between the device ingestion and the downstream consumers. Each consumer pulls only the messages it cares about.

This cuts the per-message cost dramatically and also gives you a natural audit trail. If a dashboard is wrong, you can replay the stream to that consumer and see exactly what it received.
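The shape of that pattern fits in a few lines. The sketch below uses an in-memory list as a stand-in for Kinesis, Event Hubs, or Kafka; `MessageLog` and `Consumer` are hypothetical names, not any broker's API. In a real system the per-consumer filtering is usually done with separate topics or partitions rather than a predicate, but the offset and replay mechanics are the same idea.

```python
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]


class MessageLog:
    """Append-only in-memory stand-in for a streaming system."""

    def __init__(self) -> None:
        self._messages: List[Message] = []

    def append(self, message: Message) -> int:
        """Store the message once and return its offset."""
        self._messages.append(message)
        return len(self._messages) - 1

    def read(self, offset: int) -> List[Message]:
        """Return every message at or after the given offset."""
        return self._messages[offset:]


class Consumer:
    """Tracks its own offset and keeps only the messages it wants."""

    def __init__(self, name: str, wants: Callable[[Message], bool]) -> None:
        self.name = name
        self.wants = wants
        self.offset = 0

    def poll(self, log: MessageLog) -> List[Message]:
        batch = log.read(self.offset)
        self.offset += len(batch)
        return [m for m in batch if self.wants(m)]

    def replay(self, log: MessageLog, from_offset: int = 0) -> List[Message]:
        """Rewind and re-read, for example to debug a wrong dashboard."""
        self.offset = from_offset
        return self.poll(log)
```

Each message is written once. An alerting consumer built as `Consumer("alerts", lambda m: m["temp_c"] > 80.0)` and an archive consumer built as `Consumer("archive", lambda m: True)` read the same log at their own pace, and either one can be replayed without touching the devices.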

4. Timeseries Storage Is Not Database Storage

The instinct to dump IoT data into Postgres or MongoDB fails at surprisingly low volumes. These are general-purpose databases, and timeseries data has a very specific access pattern: write-heavy, append-only, range-scanned on read, and usually queried as aggregates rather than raw rows.

Timescale, InfluxDB, Amazon Timestream, and Azure Data Explorer are built for this. They compress repetitive data aggressively (10x to 100x over raw Postgres), they handle retention policies natively, and they have functions for downsampling and continuous aggregation that are extremely painful to reproduce in a general-purpose DB.

Our default pattern is timeseries DB for hot data (last 30 to 90 days), object storage for cold data (everything older), and a query layer that can span both when someone needs to look at a year of trends. Storage costs drop by more than an order of magnitude compared to keeping everything in a hot database.
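The query layer's core decision is where to split a requested time range. A minimal sketch, assuming a 90-day hot retention window; `plan_query` and the "hot" and "cold" labels are illustrative names, and the actual reads against the timeseries DB and object storage are out of scope.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Tuple

Segment = Tuple[str, datetime, datetime]  # (tier, start, end)


def plan_query(start: datetime, end: datetime, now: datetime,
               hot_retention: timedelta = timedelta(days=90)) -> List[Segment]:
    """Split [start, end) into the parts served by cold and hot storage."""
    if end <= start:
        raise ValueError("end must be after start")
    boundary = now - hot_retention  # data older than this lives in cold storage
    plan: List[Segment] = []
    if start < boundary:
        plan.append(("cold", start, min(end, boundary)))
    if end > boundary:
        plan.append(("hot", max(start, boundary), end))
    return plan


if __name__ == "__main__":
    now = datetime(2023, 2, 1, tzinfo=timezone.utc)
    for tier, seg_start, seg_end in plan_query(now - timedelta(days=365), now, now):
        print(tier, seg_start.date(), "to", seg_end.date())
```

A one-year trend query touches both tiers, while the far more common "last week" query never leaves the hot store.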

5. Analytics Can Be Done in Batch

The fashion in IoT marketing is realtime everything. In reality, most analytics — dashboards, reports, anomaly detection, predictive maintenance models — can run in batch every minute, every five minutes, or every hour without any user noticing. Batch processing is cheaper and simpler than streaming processing by a significant margin.

Reserve streaming analytics for the cases where you actually need a sub-second reaction: safety-critical alarms, fraud detection, control loops. Everything else can run in dbt or a scheduled Spark job. Your operations team will thank you and your bill will be half what it would be otherwise.
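As a rough illustration of how little a batch job needs, the sketch below averages readings into five-minute buckets per device. The `(device_id, unix_timestamp, value)` tuple layout and the function name are assumptions; in practice this logic would more likely live in a SQL model or a Spark job, triggered by whatever scheduler you already run.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Reading = Tuple[str, int, float]  # (device_id, unix_timestamp_seconds, value)


def bucket_averages(readings: Iterable[Reading],
                    bucket_seconds: int = 300) -> Dict[Tuple[str, int], float]:
    """Average readings per device per fixed time bucket."""
    totals: Dict[Tuple[str, int], List[float]] = defaultdict(lambda: [0.0, 0.0])
    for device_id, timestamp, value in readings:
        bucket_start = timestamp - (timestamp % bucket_seconds)
        accumulator = totals[(device_id, bucket_start)]
        accumulator[0] += value  # running sum
        accumulator[1] += 1      # running count
    return {key: total / count for key, (total, count) in totals.items()}


if __name__ == "__main__":
    sample = [("a", 0, 10.0), ("a", 299, 20.0), ("a", 300, 30.0), ("b", 10, 5.0)]
    for (device, bucket), average in sorted(bucket_averages(sample).items()):
        print(device, bucket, average)
```

The job is stateless between runs, which is most of why it is cheaper to operate than an always-on streaming pipeline.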

6. Edge Connectivity Is an Ongoing Cost, Not a One-Time Purchase

The device side of IoT is where total cost of ownership gets slippery. Cellular data plans, LoRaWAN gateways, private LTE infrastructure, WiFi coverage in industrial facilities — all of these have monthly costs that do not show up in the pilot. Whoever builds your pilot usually ignores them. Whoever operates the production system discovers them.

Before committing to a cloud IoT architecture, price the connectivity side honestly:

  • Cellular is flexible but metered. A data plan at $5 to $15 per device per month adds up fast.
  • LoRaWAN is cheap per device but requires gateway infrastructure.
  • WiFi is free until you have to run it into a warehouse with cinderblock walls and forklifts.
  • Private 5G is becoming viable for large facilities and is worth pricing if you have scale.

Design your device telemetry cadence with the connectivity cost in mind. A device that sends a kilobyte of JSON every minute on cellular is meaningfully more expensive than one that sends a compact binary payload every five minutes.
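The gap between those two designs is easy to estimate. The sketch below packs one reading into a fixed binary layout with Python's `struct` module and compares monthly payload volume. The field layout (`RECORD_FORMAT`), the field names, and the 1,024-byte JSON size are assumptions for illustration, and the totals count payload bytes only, not MQTT, TLS, or IP overhead.

```python
import json
import struct

# Assumed layout: uint32 unix time, int16 temperature in hundredths of a
# degree C, uint16 humidity in hundredths of a percent. Little-endian, 8 bytes.
RECORD_FORMAT = "<IhH"


def encode_json(timestamp: int, temp_c: float, humidity_pct: float) -> bytes:
    return json.dumps({
        "timestamp": timestamp,
        "temperature_c": temp_c,
        "humidity_pct": humidity_pct,
    }).encode("utf-8")


def encode_binary(timestamp: int, temp_c: float, humidity_pct: float) -> bytes:
    return struct.pack(RECORD_FORMAT, timestamp,
                       round(temp_c * 100), round(humidity_pct * 100))


def decode_binary(payload: bytes) -> tuple:
    timestamp, temp, humidity = struct.unpack(RECORD_FORMAT, payload)
    return timestamp, temp / 100, humidity / 100


def monthly_bytes(payload_size: int, interval_seconds: int, days: int = 30) -> int:
    """Payload bytes per device per month, ignoring protocol overhead."""
    return payload_size * (days * 86_400 // interval_seconds)


if __name__ == "__main__":
    print("1 KB JSON every minute:   ", monthly_bytes(1024, 60), "bytes")
    size = struct.calcsize(RECORD_FORMAT)
    print("binary every five minutes:", monthly_bytes(size, 300), "bytes")
```

Under these assumptions the JSON device moves about 44 MB of payload a month and the binary device about 69 KB. Protocol overhead narrows the ratio in practice, but on a metered plan the difference still decides which data tier each device needs.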

Three Takeaways

  1. Sample as slowly as your use case tolerates. High-frequency telemetry is the default failure mode of IoT projects.
  2. One ingestion path, many consumers. Do not let every downstream system charge you for the same message.
  3. Price connectivity honestly at design time. The device side of IoT is where pilots turn into surprise bills.
