Monday, January 30, 2012

Apples and Oranges (GA144 and ARM Cortex A8)

In my previous posts I mentioned how multiple-CPU processors such as the GA144 differ from multi-core CPUs.  I also talked about how having multiple independent processes may work better for some problem spaces.  In broad terms, the GA144 can be viewed as a very low level Erlang for modeling lightweight (low-memory, computationally simple) problems. Viewed this way, it really isn't competing (for my purposes -- a sensor base station) with most (bare) MCUs.
(An additional similarity: Forth, like Erlang, frowns upon having lots of variables, so data is carried in function parameters (Erlang) or on the stack (Forth).)

Now, if you throw an ARM + Linux + Erlang at my sensor base station, what do you get?  (If Erlang doesn't really work well on the ARM, replace it with your favorite language plus lots of processes/threads. Also, keep in mind that my sensor base station needs to run for days on battery backup.)

Now, let's pick an ARM/Linux system for comparison:  How about the BeagleBone?
This $89 beauty looks really appealing. I could see using it as my base station.  It is based on a new Cortex A8 and is feature rich for its size.

I can't compare (yet) how well it would do against a GA144. The GA144 certainly looks anemic compared to it (from just the Cortex A8 perspective).

However, I can take a quick look at power:

  • The BeagleBone consumes 170mA@5VDC with the Linux kernel idling and 250mA@5VDC peak during kernel boot (from pages 28-29 of its reference manual).
  • The GA144 consumes 7uA@1.8VDC (typical) with all nodes idling and 540mA@1.8VDC (typical) with all nodes running (from the G144A12 Chip Reference).
Of course, you can't directly compare the two, but consider this interesting tidbit: The power performance of the GA144 is directly related to how many nodes you run.
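To make those bullet points comparable, convert to power and assume a battery. Here's a rough sketch in C; the currents are the figures quoted above, but the 2000mAh battery capacity is my own illustrative assumption, and regulator/conversion losses are ignored:

```c
/* Power draw in milliwatts from current (mA) and rail voltage (V). */
double power_mw(double current_ma, double volts) {
    return current_ma * volts;
}

/* Hours a battery of the given capacity (mAh) lasts at a constant
   draw (mA). Back-of-envelope only: ignores conversion losses. */
double runtime_hours(double capacity_mah, double draw_ma) {
    return capacity_mah / draw_ma;
}
```

power_mw(170, 5.0) gives 850mW for the idle BeagleBone versus roughly 0.013mW for a fully idle GA144, and a hypothetical 2000mAh pack carries the BeagleBone through less than half a day of idling. Interestingly, all 144 nodes running flat out (540mA@1.8VDC) lands at 972mW, in the same ballpark as the BeagleBone's idle draw.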

I haven't looked at any performance numbers between the two, but I'll wager that the Cortex A8 ultimately outperforms the GA144.  But, my sensor base station is neither CPU bound (no complex calculations) nor RAM bound (important data points are persisted in flash storage and fetched as needed).

The real question is: How much useful work can I get done in 1 node?

Saturday, January 28, 2012

GA144 as a low level, low energy Erlang

I've been reading some of the newsgroup threads regarding the GA144, and most of the critiques come from a low level perspective (GA144 vs FPGAs vs ARMs, etc).  One can argue that the processor can't compete given the anemic amount of memory each node has and the limited peripheral support.  But, let us put aside that argument for a moment (we'll get back to it later).

Here I primarily want to discuss the GA144 from a software (problem solving) architecture perspective. This is where my primary interests reside. I'd like the GA144 to have formal and flexible SPI support. I'd like to see more peripheral capability built in. I'd like to see 3.3-5VDC support on I/O pins so level-shifter chips aren't needed.  But, I think the potential strong point of the GA144 is in what you can do with the cores from a software (problem solving) architectural design perspective.

Notice I keep mentioning software and problem solving together?  I want to make clear that I am not talking about software architecture in terms of library or framework building.  I'm talking about the architecture of the solution space.  I'm talking about the software model of the solution space.
Let's look at an analogy.

If I were to build a large telecommunication switch (handling thousands of simultaneous calls) and implemented the software in Erlang or C++ (assuming that both would allow me to reach spec -- maybe 99.9999% uptime, no packet loss, etc.), at the end of the day you wouldn't be able to tell the systems' performance apart.

However, one of the (advertised) benefits of Erlang is that it allows you to do massive concurrency. This is not a performance improvement, but (perhaps) a closer model to how you want to implement a telco switch.  Lots of calls are happening at the same time. This makes the software easier to reason about and (arguably) safer -- your implementation stays closer to the solution space model.

Do you see what I am getting at?

Now, let's look at the problem I've been talking about here on the blog (previously described in vague terms): I want to integrate dozens of wireless sensors with a sensor base station.  The base station can run off of outlet power but must be able to run off a battery for days (in case of a power outage). It is designed to be small and discreet (no big panel mounted to the wall with UPS backup). It needs to run 24/7 and be very fault tolerant.

The sensors are small, battery efficient and relatively "dumb". Each samples data until it reaches a prescribed threshold (perhaps performing light hysteresis/averaging before wasting power to send data) and it is up to the sensor base station to keep track of what is going on. The base station collects data, analyzes it, tracks it and may send alerts via SMS or perhaps just report it in a daily SMS summary.

Let's consider one typical sensor in my system: A wireless stove range monitor. This sensor, perhaps mounted to a range hood, would monitor the heat coming from the burners. This sensor will be used to (ultimately) let a remote individual know (via SMS) that a stove burner has been left on an unusually long time.  Perhaps grandma was cooking and forgot to turn the burner off.

This stove range sensor probably shouldn't be too smart. Or, in other words, it is not up to it to determine if the stove has been on "too long".  It reports a temperature reading once it recognizes an elevated temperature sustained over a 5 minute interval (grandma is cooking).  It then continues to report the temperature every 10 minutes until it drops below a prescribed threshold.   This is not a "smart" sensor. But it is not too dumb (it only uses RF transmit power when it has sufficiently determined a significant temperature event -- rather than just broadcasting arbitrary temperature samples all day long).
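That reporting policy is small enough to sketch in C. The 5- and 10-minute intervals come from the description above; the 60-degree "elevated" threshold is purely an illustrative guess:

```c
#include <stdbool.h>
#include <stdint.h>

#define ELEVATED_C    60          /* illustrative "grandma is cooking" level */
#define CONFIRM_SECS  (5 * 60)    /* sustained period before first report    */
#define REPORT_SECS   (10 * 60)   /* interval between follow-up reports      */

typedef struct {
    bool     elevated;        /* currently above the threshold? */
    bool     reporting;       /* first report already sent?     */
    uint32_t elevated_since;  /* when the elevated run began    */
    uint32_t last_report;     /* when we last spent RF power    */
} stove_sensor_t;

/* Called once per local sample; returns true only when the sensor
   should spend transmit power and report the reading. */
bool stove_sample(stove_sensor_t *s, uint32_t now, int temp_c) {
    if (temp_c < ELEVATED_C) {          /* cooled off: go quiet, reset */
        s->elevated = false;
        s->reporting = false;
        return false;
    }
    if (!s->elevated) {                 /* first elevated sample */
        s->elevated = true;
        s->elevated_since = now;
    }
    if (!s->reporting) {                /* wait out the 5-minute confirm */
        if (now - s->elevated_since >= CONFIRM_SECS) {
            s->reporting = true;
            s->last_report = now;
            return true;
        }
        return false;
    }
    if (now - s->last_report >= REPORT_SECS) {  /* 10-minute follow-ups */
        s->last_report = now;
        return true;
    }
    return false;
}
```

Everything else -- deciding whether the stove has been on "too long" -- stays on the base station.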

The sensor base station will need a software model that takes this data, tracks it and makes a determination that there is an issue. The stove range being on for a few hours may not, by itself, mean there is a problem. A slow elevated temperature rise followed by stasis may suggest that a pot is just simmering.  However, if the stove is exhibiting this elevated temperature past 11pm -- certainly grandma isn't stewing a chicken at that time of night!    You don't want to get too fancy, but there can be lots of data points to consider when using this stove range monitor.

Here is my solution model (greatly simplified) for this sensor monitor:

  1. Receive a temperature sample
  2. Is it at stasis? If so, keep track of how long
  3. Is it still rising? Compare it with "fire" levels -- there may be no pot on the burner or it is scorching
  4. Is the temperature still rising? Fast?  Send an SMS alert
  5. Is it on late in the evening?  Send an SMS alert
  6. Keep a running summary (timestamped) of what is going on. Log it.
  7. Every night at 11pm, generate a summary of when the range was used and for how long. Send the summary via SMS
Imagine this as a long running process. It is constantly running, considering elapsed time and calendar time in its calculations.
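The steps above might be sketched as a single long-running step function. Everything here (the action enum, the threshold values, the crude "burner on" test) is an illustrative guess rather than a finished design, and the logging and SMS plumbing of steps 6 and 7 would live elsewhere:

```c
#include <stdbool.h>

typedef enum { ACT_NONE, ACT_ALERT_SMS, ACT_NIGHTLY_SUMMARY } action_t;

/* All thresholds are illustrative placeholders. */
#define FIRE_C       250   /* scorching, or no pot on the burner */
#define FAST_RISE_C   15   /* per-sample rise considered "fast"  */
#define IN_USE_C      60   /* crude "burner is on" level         */
#define CURFEW_HOUR   23   /* 11pm                               */

typedef struct {
    int  prev_temp_c;
    int  stasis_samples;   /* step 2: how long the reading has been flat */
    bool burner_on;
} range_monitor_t;

/* One step of the long-running monitor, called per received sample. */
action_t range_step(range_monitor_t *m, int temp_c, int hour) {
    int delta = temp_c - m->prev_temp_c;
    m->prev_temp_c = temp_c;

    if (delta == 0) m->stasis_samples++;   /* stasis: maybe just simmering */
    else            m->stasis_samples = 0;

    m->burner_on = temp_c > IN_USE_C;

    if (temp_c >= FIRE_C)     return ACT_ALERT_SMS;       /* step 3 */
    if (delta >= FAST_RISE_C) return ACT_ALERT_SMS;       /* step 4 */
    if (m->burner_on && hour >= CURFEW_HOUR)
        return ACT_ALERT_SMS;                             /* step 5 */
    if (hour == CURFEW_HOUR)  return ACT_NIGHTLY_SUMMARY; /* step 7 */
    return ACT_NONE;  /* step 6 (timestamped logging) happens elsewhere */
}
```

The point is how little code the model itself takes when it can be written as one continuous process.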

Now, this is just one of many types of sensor that the base station must deal with. Each will have its own behavior (algorithm).

I can certainly handle a bunch of sensors with a fast processor (ARM?). But my software model is different for each sensor. Wouldn't it be nice for each sensor model to be independent? I could do this with Linux and multiple processes. But, really, the above model isn't that sophisticated. It could (perhaps) easily fit in a couple of GA144 nodes (the sensor handler, logger, calendar and SMS notifier would exist elsewhere). And it would be nice to code this model as described (without considering state machines or context switches, etc).

So, back to the argument at the top of this post... I don't care if the GA144 isn't a competitive "MCU". My software models are simple but concurrent.  My design could easily use other MCUs to handle the SMS sending or RF radio receives. What is important is the software model. The less I have to break that up into state machines or deal with an OS, the better.

This is my interest in the GA144: A low level, low energy means of keeping my concurrent software model intact.  I don't need the GA144 to be an SMS modem handler.  I don't need it to perform the duties of a calendar. I need it to help me implement my software model as designed.

Monday, January 23, 2012

True Modular computing

I want to tie together my past 3 posts.
In Multi-computers vs multitasking vs interrupts I stated a problem (concurrent processing within the embedded realm) and teased you with a solution (GreenArray's GA144).

In Costs of a multiple mcu system I backed a bit away from the GA144 and wondered if a bunch of small, efficient and cheap MCUs could solve the problem.

In Building devices instead of platforms  I offered a rationale to my pondered approaches.

So, here I sit composing an interrupt handler for an 8051 to service a GSM modem's UART (essentially to buffer all the incoming data).   And... everything has just gotten complicated.  I've been here before. I am no stranger to such code, and the approach I am taking is textbook.  This is how code starts to get hairy and unpredictable.

But, really now... maybe I *should* consider breaking my tasks down into hardware modules (each module consisting of dedicated software).  If I dedicated an 8051 (tight loop, no interrupts) to just talking to the modem, collecting responses and sending only the relevant information to another 8051 (perhaps through I2C or SPI), then I could build that module once, debug it once and be done with it.
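A sketch of what that dedicated modem MCU would run. The hardware access is abstracted behind function pointers since real UART/SPI registers are part-specific, and which response lines count as "relevant" is my own illustrative choice (though "+CMTI:" and "RING" are real GSM unsolicited result codes):

```c
#include <stdbool.h>
#include <string.h>

/* Decide whether a complete response line from the GSM modem is worth
   forwarding to the main MCU. The selection here is illustrative. */
bool modem_line_relevant(const char *line) {
    return strncmp(line, "+CMTI:", 6) == 0 ||  /* new SMS arrived */
           strcmp(line, "RING") == 0 ||
           strcmp(line, "ERROR") == 0;
}

/* The dedicated MCU's whole job: a tight loop, no interrupts.
   getc_fn/send_fn stand in for the UART and the inter-MCU link. */
void modem_task(int (*getc_fn)(void), void (*send_fn)(const char *)) {
    static char buf[64];
    size_t n = 0;
    for (;;) {
        int c = getc_fn();               /* block until a byte arrives */
        if (c == '\r' || c == '\n') {    /* end of a response line */
            buf[n] = '\0';
            if (n > 0 && modem_line_relevant(buf))
                send_fn(buf);            /* forward only what matters */
            n = 0;
        } else if (n < sizeof buf - 1) {
            buf[n++] = (char)c;
        }
    }
}
```

No ISR, no shared buffers, no race conditions: the loop is always "there" to receive the next byte.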

This is modular design, isn't it?

So (during design) every time I find a need for an interrupt, I just fork another processor?

(This would work with a single GreenArrays GA144 as well as with $3 SiLabs 8051 MCUs.)

Building devices instead of platforms

The title of this posting is a concept that flies in opposition to conventional wisdom.  Everything seems to be a platform these days. When you buy a gadget or appliance you are not buying a device, you are investing in a platform. Refrigerators are appearing with Android embedded.  We are looking at a future of doing "software updates" to our appliances!

Of course, there is big talk about the "Internet of Things", and that could be grand (my washing machine could one day query my fridge about the tomato sauce stain it encountered on my shirt and then place an order on Amazon for the correct stain treatment product).

But, consider the "elephant in the room": these devices will suffer from that most dreaded of software plagues: "ship now; fix later in an update".

This doesn't tend to happen when you talk about your appliances and devices containing "firmware" (in the original sense). My washing machine has embedded microprocessors but there is no evident way to upgrade the firmware. Hence, the engineers have got to get it right before it ships.  The software is apparently sophisticated. Of course, this is not always the case, but the "get it right" mindset is there. You don't want to tell people they have to call a service technician and "upgrade" their washing machine when it locks up.

For all the smarts my washing machine has, it is still a "device" (not a platform).

This rant bleeds into the common "old fogy" complaint that a cell phone should be (just) a phone.  I don't think we need to start limiting the potential of our devices, but some devices should just "work".
We are losing that when we start designing a device and our first step is to choose an OS.

Friday, January 20, 2012

Costs of a multiple mcu system (back of envelope)

If physical size isn't an issue (you don't need the tiniest of footprints) and unit cost isn't measured in cents, have you considered a multi-MCU system?  If your system is highly interrupt driven (receiving lots of I/O from more than one place), then you'll need either an OS or at least some solid state-machine based design to handle the concurrency.

If I can get 3 SiLabs 8051 variants for between $12-$15, I would need only around 6 caps and resistors to get them up and running.  So, total MCU cost would be $15 max. These old 8-bit workhorses are self-contained systems. They rarely need an external crystal, often come with built-in regulators, and are meant to have a low passive component count.  You can just "drop" them in, using only a few millimeters of board space.

What does this get me? Potentially less complexity for the software.  Consider assigning your MCUs thus: each I/O subsystem gets its own MCU for processing.  You can design, code and test each subsystem separately.  With a means of message passing (SPI, shared memory/flash, etc) you now have an integrated system.
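The message-passing link can be very modest. Here's a minimal framing sketch (my own illustrative format, not any standard): a length byte, the payload, and an XOR checksum so the receiving MCU can reject corrupted transfers:

```c
#include <stdint.h>
#include <stddef.h>

static uint8_t xor_sum(const uint8_t *p, size_t n) {
    uint8_t x = 0;
    while (n--) x ^= *p++;
    return x;
}

/* Build [len][payload...][checksum] into out.
   Returns total frame size, or 0 if the buffer is too small. */
size_t frame_encode(uint8_t *out, size_t cap,
                    const uint8_t *payload, uint8_t len) {
    if ((size_t)len + 2 > cap) return 0;
    out[0] = len;
    for (uint8_t i = 0; i < len; i++) out[1 + i] = payload[i];
    out[1 + len] = xor_sum(payload, len);
    return (size_t)len + 2;
}

/* Returns the payload length, or -1 on a bad checksum. */
int frame_decode(const uint8_t *in, uint8_t *payload_out) {
    uint8_t len = in[0];
    if (xor_sum(in + 1, len) != in[1 + len]) return -1;
    for (uint8_t i = 0; i < len; i++) payload_out[i] = in[1 + i];
    return len;
}
```

Each subsystem MCU talks only in these frames, so the link itself can be tested in isolation on a desktop before any hardware is wired up.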

This is hardware-based functionality factoring.  The industry already does this: if you have bought a GPS module, a cellular modem or even a (micro)SD card, you have bought an MCU+X (where X is the functionality).  Peripherals often come with their own embedded MCUs (with custom firmware).  However, we expect those peripherals to be (relatively) flawless.

Software, on the other hand, is under constant revision,  bug fixes and improvements.

Consider this: If you buy a memory hardware module that has built in support for FAT filesystems (maybe accessed via a UART or SPI), then you expect that hardware module to work perfectly. It is a $5 piece of hardware that should work out of the box. There is no notion of "field" upgrades.

However, if you have bought (or downloaded) a FAT filesystem as a software library, you'll see a few revisions/improvements with a release cycle. It doesn't have to work perfectly. You are expected to upgrade occasionally.

Hardware is getting cheap enough that (for small unit counts) we should seriously consider multiple MCU systems.

Curiously, GreenArrays has sidestepped this issue and simply incorporated a bunch of small MCUs into one package.

Multi-computers (GA144) vs multitasking (ARM) vs interrupts (8051/MSP430)

You've got a system to design.  It is a multi-sensor network that ties to a small, efficient battery run base station. The sensor nodes are straightforward. You start thinking about the base station:

I've got a barrage of data coming at me over the air at 433Mhz.  I have a simple 2 byte buffer on the SPI connected transceiver so I don't have a lot of time to waste on the MCU. I must be there to receive the bytes as they arrive.
Once I've received the data, it must be parsed into a message, validated, logged (to persistent storage) and perhaps correlated with other data to determine if an action must be taken. An action can also be timer based (scheduled). Oh, and I must eventually acknowledge the message receipt or else the sender will keep sending the same data.
Additionally (and unfortunately), the action I need to perform may involve dialing a GSM modem and sending a text message. This can take some time. Meanwhile, data is flowing in.  What if a scheduled event must take place while I'm in the middle of processing a new message?

Now, this is the sort of thing that you would throw a nice hefty ARM at, with a decent OS (maybe Linux) to do multitasking and work-queue management.  But, let's think a second... Most of what I've just described works out to a nice simple flow diagram: receive data -> parse data into message -> reject duplicate messages -> log message -> correlate message with previous "events" -> determine if we need to send a text message -> send text messages at specific times -> start over.

Each task is pretty straightforward.  The work is not CPU bound.  You really don't need a beefy ARM for each task. What we want the ARM to do is coordinate a bunch of concurrent tasks. Well, that will require a preemptive OS.  And then we start down that road... time to boot, linking in a bunch of generic libraries, thinking about using something a little more high level than C, etc. We now have a fairly complex system.

And, oh... did I mention that this must all run nicely on a rechargeable battery for weeks?  And, yank the battery at any time -- the system must recover and pick up where it left off.   So, just having a bunch of preemptive tasks communicating via OS queues isn't quite enough. We will probably need to persist all queued communication.  But I am getting distracted here.  The big system eats a little bit too much power...

Okay, so we go a bit smaller. Maybe a nice ARM Cortex-M3 or M0.  Okay, the memory size is reduced and it's a bit slower than a classic ARM.  A preemptive OS starts to seem a bit weighty.

So, how about a nice MSP430 (a big fat one with lots of RAM and program space)?  Now start to think about how to do all of that without a preemptive OS (yes, I know you can run a preemptive time-sliced OS on the MSP430, but that is a learning curve, and besides you are constraining your space even further).  Do you go with a cooperative OS?  Well, now you have to start thinking about states... how do you partition the tasks into explicit steps?  At this point you start thinking about rolling your own event loop.
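A hand-rolled event loop can be as small as this sketch: tasks are plain step functions visited round-robin (the names and structure are illustrative). The catch, as noted above, is that every task must be chopped into short explicit steps -- exactly the state-machine bookkeeping I'd rather avoid:

```c
#include <stddef.h>

/* Cooperative scheduling: each task does one short step of work and
   returns. No preemption, no context switches to reason about. */
typedef void (*task_fn)(void *state);
typedef struct { task_fn fn; void *state; } task_t;

void run_round_robin(task_t *tasks, size_t ntasks, int rounds) {
    for (int r = 0; r < rounds; r++)
        for (size_t i = 0; i < ntasks; i++)
            tasks[i].fn(tasks[i].state);
}

/* A stand-in task: a real one would poll the SPI transceiver or feed
   the GSM modem one AT command at a time. */
void count_step(void *state) { (*(int *)state)++; }
```

On real hardware the `rounds` loop would be `for (;;)`, and any task that blocks (say, waiting on the modem) starves every other task -- which is where the partitioning headache begins.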

So, then there are the interrupts:

Oh, I forgot about those.  The SPI between the MSP430 and the transceiver. The UART to the GSM modem. And the flash -- don't forget the persistent storage.  And checkpointing: the system needs to pick up where it left off if the battery runs out.

Okay, you are getting desperate. You start thinking:

 What if I had one MCU (MSP430, C8051, whatever) dedicated to the transceiver and one dedicated to the GSM modem.  The code starts to get simpler.  They'll communicate via the persistent storage.... But, who handles the persistent storage?  Can the transceiver MCU handle that too?  

This is where thoughts of multi-computers  come in. Not multi-cores (we are not CPU bound!), but multi-computers.  What if I had enough computers that I could dedicate each fully to a task? What if I didn't have to deal with interrupts?  What would these computers do?

- Computer A  handles the transceiver
- Computer B  handles logging (persistent storage interface, indexing, etc)
- Computer C  handles the GSM modem (AT commands, etc)
- Computer D  handles parsing and validating the messages
- Computer E  handles "scheduling" (time based) events
  - Computer F handles checkpointing the system (for system reboot recovery)
etc etc etc

This is where I start really, really thinking that I've found a use for my GA144 board: Performing and coordinating lots of really simple tasks.

It becomes GA144 vs ARM/Cortex + Linux.

Now if I can only figure out how to do SPI in arrayForth...