Have Data, Need Iterator
As part of the Embedded Demo Project I am implementing a
more full-featured i2c driver for the STM32. It would be nice if the driver’s write function
could operate on any iterable container. Give the write operation the destination
address and a series of bytes, and bam… you have a blinky LED or maybe data stored to
an SD card.
Communication on slow serial interfaces like i2c benefits from hardware accelerators. The particulars of these accelerators vary but the generic version is:
- configure hardware/communication particulars
- load data into the output buffers
- tell the hardware to go
It is not uncommon that you want to write more data than the output buffer holds at once. As such, the software iterates through the data and loads the output buffer with each chunk. This is where DMA comes in handy … but that will be a future post.
In our i2c write example, we will:
1. set up the controller
2. write 1 byte to the output buffer
3. wait for an event to indicate the output buffer is empty
4. return to step 2 until all data is written
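To make those steps concrete before bringing senders in, here is a minimal hand-rolled sketch of the same flow as an event-driven state machine. Everything in it is hypothetical: fake_i2c stands in for the peripheral with a one-byte output buffer, byte_writer is the software side, and simulate_write plays the role of the interrupt/event source. It is not the driver from the project, just a model of the control flow.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an i2c peripheral with a one-byte output buffer.
struct fake_i2c {
    std::vector<std::uint8_t> wire; // bytes that have been "shifted out"
    std::uint8_t buffer = 0;
    bool buffer_full = false;

    void load(std::uint8_t b) {
        assert(!buffer_full); // never overwrite a byte that is still pending
        buffer = b;
        buffer_full = true;
    }

    // Simulates the hardware finishing one byte. Returns true when a
    // "buffer empty" event should be delivered to software.
    bool shift_out() {
        if (!buffer_full) { return false; }
        wire.push_back(buffer);
        buffer_full = false;
        return true;
    }
};

// Software side: loads one byte per "buffer empty" event and never blocks.
template <typename Iter>
struct byte_writer {
    fake_i2c &hw;
    Iter iter;
    Iter iter_end;

    void on_buffer_empty() {
        if (iter != iter_end) { hw.load(*iter++); } // step 2
    }
};

// Stands in for the event loop: kick off the first byte, then deliver an
// event each time the fake hardware finishes shifting a byte (steps 3, 4).
template <typename Iter>
std::vector<std::uint8_t> simulate_write(Iter first, Iter last) {
    fake_i2c hw{};
    byte_writer<Iter> writer{hw, first, last};
    writer.on_buffer_empty();
    while (hw.shift_out()) {
        writer.on_buffer_empty();
    }
    return hw.wire;
}
```

Calling simulate_write(data.begin(), data.end()) on a std::vector<std::uint8_t> returns the bytes in the order they went out on the fake wire. Note how the loop state (iter, iter_end) has to live in an object that outlives each event; that is the problem the rest of this post solves with senders.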
Now I hear you saying, “I thought you already demonstrated this in your CppCon 2025 Groov talk.” In that talk I took advantage of knowing the size of the data array at compile-time. I was able to simply construct the sender chain at compile-time with the proper number of writes:
auto values = stdx::bit_unpack<std::uint8_t>(data);
auto write_data =
    std::apply([](auto ... v, auto last) {
        return
            (async::seq(write_byte(v))
             | ...
             | async::seq(write_last_byte(last)));
    }, values);
This is great because the entire write chain is composed at compile-time, offering additional optimizations. But what if we don’t know the size at compile-time? What if all we have is the start and end iterators? What can we do?
If this were plain-old sequential code, we might write something like:
while (iter != iter_end) {
    write_byte(*iter);
    ++iter;
}
This is fine if we don’t mind blocking on each write_byte operation, but in an event-driven
system that has to handle many activities, it is a non-starter.
Asynchronous Functions
Senders provide a nice abstraction so that we can reason about asynchronous functions.
A | B | C
A is followed by running B with the results of A which is followed by running C with
the results of B. Anyone who has dealt with a Unix shell will feel at home. As the user of
this construct we don’t need to think about the asynchronicity (more on this in a future blog
post).
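If the pipe syntax is new to you, the following toy may help. It is a purely synchronous analogy, assuming nothing about how the senders library is implemented: stage and run_pipeline_example are hypothetical names, and there are no receivers, schedulers, or error channels here. It only shows the two properties that matter for this post: composing with | runs nothing, and each stage receives the result of the previous one.

```cpp
#include <cassert>
#include <utility>

// Hypothetical wrapper so that operator| only applies to our own stages.
template <typename F>
struct stage {
    F f;

    template <typename... Args>
    auto operator()(Args &&...args) const {
        return f(std::forward<Args>(args)...);
    }
};

template <typename F>
stage(F) -> stage<F>;

// A | B: build a new stage that runs A and feeds its result to B.
template <typename A, typename B>
auto operator|(stage<A> a, stage<B> b) {
    return stage{[a, b](auto &&...args) {
        return b(a(std::forward<decltype(args)>(args)...));
    }};
}

inline int run_pipeline_example() {
    bool ran = false;
    auto a = stage{[&ran] { ran = true; return 2; }};
    auto b = stage{[](int x) { return x + 3; }};
    auto c = stage{[](int x) { return x * 10; }};

    auto chain = a | b | c; // composition only describes the work
    assert(!ran);           // nothing has run yet

    return chain();         // (2 + 3) * 10
}
```

The laziness shown by the assert is the important part: the chain is built in one place and run later, which is why the lifetime of anything it refers to matters.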
We are left with some questions for a sender chain:
- how are we going to loop?
- how will we evaluate when we are done looping?
- how do we control the scope of variables?
What seems trivial in our sequential code now seems a bit more daunting.
Write, Increment, Repeat
Perusing the Intel C++ Baremetal Senders and Receivers documentation, we will notice the algorithm repeat. This seems promising.
auto send_data = write_byte() | repeat();
If we can:
- get the current data word to write_byte
- increment to the next data word
- repeat if we aren’t at the end of the data list
then this might be a direction.
There are three variations of repeat:
- repeat: repeats forever
- repeat_n: repeats N number of times, where N is known at the time of construction
- repeat_until: repeats until the passed callable returns true
It seems that we might be able to construct a solution with either repeat_n or repeat_until.
Let’s try both and see what the differences are in ease-of-writing and generated code.
repeat_n
The repeat_n algorithm repeats the sender N times, so the sender runs N+1 times in total:
once for the initial run and then N repeats. So if we
wanted the chain to run 5 times we might have something like:
auto send_data = write_byte() | repeat_n(4);
Or maybe we could have:
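auto send_data(auto iter, auto iter_end) {
    // repeat_n(N) runs the sender N+1 times, so d bytes need d-1 repeats
    // (this assumes a non-empty range)
    auto d = std::distance(iter, iter_end);
    return write_byte() | repeat_n(d - 1);
}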
This certainly looks like it will do the repeating job.
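The N versus N+1 distinction is an easy place for an off-by-one, so here is a small plain-C++ model of it. This is only a sketch of the behaviour described above (run once, then repeat N more times), not the library's implementation, and run_with_repeat_n and repeats_for_range are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>

// Models "run the work once, then repeat it n more times".
// Returns the total number of runs, which is n + 1.
template <typename F>
std::size_t run_with_repeat_n(F &&f, std::size_t n) {
    std::size_t runs = 0;
    f();
    ++runs;
    for (std::size_t i = 0; i < n; ++i) {
        f();
        ++runs;
    }
    return runs;
}

// Number of repeats needed so the work runs exactly once per element.
// Assumes a non-empty range, since the work always runs at least once.
template <typename Iter>
std::size_t repeats_for_range(Iter first, Iter last) {
    auto d = std::distance(first, last);
    assert(d > 0);
    return static_cast<std::size_t>(d) - 1;
}
```

With this model, run_with_repeat_n(work, 4) runs work five times, and a three-element range needs two repeats. An empty range has no valid repeat count here, which is worth keeping in mind for any "run, then maybe repeat" construct.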
repeat_until
The repeat_until algorithm takes a predicate that will determine when to stop. Using
our previous example:
auto send_data(auto iter, auto iter_end) {
    return write_byte() | repeat_until([&](){return ++iter == iter_end;});
}
This might seem nice at first glance, but senders are lazily evaluated: by the time the
chain actually runs, send_data has already returned, so the references to its parameters
iter and iter_end are dangling. We somehow need to get the state into the sender
chain so that it stays in scope.
Welcome to structured concurrency. As we build up a sender chain, we want to put the state inside the chain.
auto send_data(auto iter, auto iter_end) {
    return let_value([iter, iter_end]() mutable {
        return
            write_byte()
            | repeat_until([&](){return ++iter == iter_end;})
            ;
    });
}
let_value takes a callable that will return a sender when called. The callable
is copied into the operation state when the sender connects to the receiver. This means
the callable will stay in scope during the lifetime of the sender chain execution.
In the above example, iter and iter_end are captured into the closure object which is
stored in the op-state of the let_value. repeat_until is referencing the captured data.
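To see why this fixes the lifetime problem, here is a heavily simplified model in plain C++. It is a sketch under loud assumptions: toy_op_state and make_send are hypothetical names, the "sender" is just a callable, and the byte sink is a std::vector that the caller must keep alive. The only thing it is meant to show is that the inner [&] lambda refers to the iterator copies stored inside the outer closure, and that the outer closure lives inside the operation state.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical, much simplified operation state: it owns a copy of the
// callable for as long as the operation exists.
template <typename F>
struct toy_op_state {
    F factory; // the closure object, including its captured iterators

    void start() {
        auto work = factory(); // build the inner work lazily
        work();                // run it while factory is still alive
    }
};

template <typename Iter>
auto make_send(Iter iter, Iter iter_end, std::vector<std::uint8_t> &sink) {
    auto factory = [iter, iter_end, &sink]() mutable {
        // [&] here refers to the copies of iter and iter_end held by the
        // enclosing closure, not to make_send's parameters.
        return [&] {
            assert(iter != iter_end);    // body runs before the first check
            do {
                sink.push_back(*iter);   // stand-in for write_byte
            } while (++iter != iter_end);
        };
    };
    return toy_op_state<decltype(factory)>{std::move(factory)};
}
```

Usage is two steps, mirroring the lazy evaluation of a real chain: auto op = make_send(data.begin(), data.end(), sink); builds the operation without writing anything, and op.start(); does the writes. By then make_send has long since returned, yet the iterators are still valid because they were moved into op.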
This is pretty nice and also a bigger hammer than we need right now. When the callable to
let_value takes no arguments it is usually an indicator that the less-powerful and lighter-weight
sequence can be used:
auto send_data(auto iter, auto iter_end) {
    return sequence([iter, iter_end]() mutable {
        return
            write_byte()
            | repeat_until([&](){return ++iter == iter_end;})
            ;
    });
}
Another way to capture values in the sender chain is to use just.
auto send_data(auto iter, auto iter_end) {
    return
        just(iter, iter_end)
        | let_value([](auto & iter, auto & iter_end) {
            return
                write_byte()
                | repeat_until([&](){return ++iter == iter_end;})
                ;
        })
        ;
}
Capturing the values in the just op-state is probably more canonical in the S/R world.
The resulting codegen and memory usage is basically the same.
Putting it Together
We now have some repeating, asynchronous functions but we haven’t sent any
data. Let’s build on the lighter-weight sequence variation to iterate through the
data.
auto send_data(auto iter, auto iter_end) {
    return sequence([iter, iter_end]() mutable {
        return
            just()
            | then([&iter]() -> std::uint8_t {
                return *iter++;
            })
            | write_byte()
            | wait_write_done()
            | repeat_until([&iter, &iter_end]() {
                return iter == iter_end;
            })
            ;
    });
}
sequence takes a callable that takes no arguments and returns a sender.
We capture the iter and iter_end within the closure object, as previously
discussed.
- The just is a factory to get us going … we need a sender.
- then is an adapter that takes a callable. In our case, we don’t need anything from the value channel. We just capture iter by reference. We return the value the iterator is currently pointing at and increment the iterator. That returned value is in the value-channel.
- Let’s assume write_byte() is an adapter that returns a sender that extracts the value to write from the value channel and sticks that value into the hardware buffer.
- wait_write_done is an adapter that returns a sender that will progress when the write value is done being shifted by the hardware. We are going to leave this as hand-wavey until the future post on interrupts (optionally… see any of my past conference talks on baremetal senders).
- repeat_until the iterator is at the end.
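One way to sanity-check the ordering of these steps is to write down the blocking code the chain corresponds to. The sketch below is only a mental model, assuming the predicate is evaluated after each pass through the chain; model_send_data is a hypothetical name, the std::vector stands in for the hardware buffer, and the wait step is a comment because nothing here is asynchronous.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Synchronous model of the chain: then -> write_byte -> wait_write_done,
// followed by the repeat_until predicate.
template <typename Iter>
std::vector<std::uint8_t> model_send_data(Iter iter, Iter iter_end) {
    std::vector<std::uint8_t> written; // stand-in for the hardware buffer

    // The body runs once before the predicate is first checked, so this
    // model (like a do-while loop) requires a non-empty range.
    assert(iter != iter_end);

    do {
        std::uint8_t value = *iter++; // then: read current byte, advance
        written.push_back(value);     // write_byte: hand the value over
        // wait_write_done: the real chain would suspend here until the
        // hardware signals that the byte has been shifted out
    } while (!(iter == iter_end));    // repeat_until: stop when this is true

    return written;
}
```

Because the iterator is advanced in the then step rather than in the predicate, the predicate only has to compare iter with iter_end, which matches the final version of send_data above.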