Details of Pstreams Shared Memory Block Transfer

Commit 0e436a69 by Lennart Poettering introduced shared memory support to PulseAudio back in 2006.

From the commit log:

Rework memory management to allow shared memory data transfer. The central idea is to allocate all audio memory blocks from a per-process memory pool which is available as read-only SHM segment to other local processes. Then, instead of writing the actual audio data to the socket just write references to this shared memory pool.

To work optimally all memory blocks should now be of type PA_MEMBLOCK_POOL or PA_MEMBLOCK_POOL_EXTERNAL. The function pa_memblock_new() now generates memory blocks of this type by default.

So, how are audio memory blocks (and now even commands, after the srbchannel patch) sent and received using this new mechanism?

The entry point is pa_pstream_send_memblock(). As its name suggests, it sends a memblock over the pstream to the other end of the connection:

void pa_pstream_send_memblock(pa_pstream*p, uint32_t channel, int64_t offset, pa_seek_mode_t seek_mode, const pa_memchunk *chunk) {
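    size_t length, idx;
    size_t bsm;

    /* [Book Note]: Start at the beginning of the chunk, with all of it left to send */
    idx = 0;
    length = chunk->length;

    bsm = pa_mempool_block_size_max(p->mempool);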

    while (length > 0) {
        struct item_info *i;
        size_t n;

        ...
        i->type = PA_PSTREAM_ITEM_MEMBLOCK;

        n = PA_MIN(length, bsm);
        i->chunk.index = chunk->index + idx;
        i->chunk.length = n;
        i->chunk.memblock = pa_memblock_ref(chunk->memblock);

        i->channel = channel;
        i->offset = offset;
        i->seek_mode = seek_mode;

        pa_queue_push(p->send_queue, i);

        idx += n;
        length -= n;
    }

    p->mainloop->defer_enable(p->defer_event, 1);
}
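The splitting loop above can be isolated into a small, self-contained sketch. The struct piece type and the split_chunk() helper are hypothetical names made up for this illustration; only the arithmetic mirrors the loop in pa_pstream_send_memblock():

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one queued item: a window into the original chunk. */
struct piece {
    size_t index;   /* start of this piece, relative to the memblock */
    size_t length;  /* number of bytes covered by this piece */
};

/* Split a chunk of chunk_length bytes, starting at chunk_index, into pieces of
 * at most bsm bytes each. At most max_pieces entries are written to out; the
 * return value is the number of pieces the chunk needs in total. */
size_t split_chunk(size_t chunk_index, size_t chunk_length, size_t bsm,
                   struct piece *out, size_t max_pieces) {
    size_t idx = 0, length = chunk_length, count = 0;

    assert(bsm > 0);

    while (length > 0) {
        size_t n = length < bsm ? length : bsm;  /* same role as PA_MIN(length, bsm) */

        if (count < max_pieces) {
            out[count].index = chunk_index + idx;
            out[count].length = n;
        }

        count++;
        idx += n;
        length -= n;
    }

    return count;
}
```

With a 2500-byte chunk and a 1024-byte maximum block size, this yields two full pieces and one 452-byte remainder, each of which would become its own queued item.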

Note the item type set above: pstreams can send several types of "items", namely:

struct item_info {
    enum {
        PA_PSTREAM_ITEM_PACKET,
        PA_PSTREAM_ITEM_MEMBLOCK,
        PA_PSTREAM_ITEM_SHMRELEASE,
        PA_PSTREAM_ITEM_SHMREVOKE
    } type;
    ...
}

In our parent article, What are Pstreams?, we've seen how PACKET items get handled. Here we concern ourselves with MEMBLOCK ones.

As noted in that document, the final writing step of pstreams is do_write(). Before writing anything, do_write() calls prepare_next_write_item(), which prepares all the data that needs to be written over the pipe.

The logic for sending shared memory blocks is encapsulated in that method, as follows:

static void prepare_next_write_item(pa_pstream *p) {
    pa_assert(p);
    pa_assert(PA_REFCNT_VALUE(p) > 0);

    /* [Book Note]: What was pushed in the queue earlier */
    p->write.current = pa_queue_pop(p->send_queue);

    ...

    if (p->write.current->type == PA_PSTREAM_ITEM_PACKET) {
        ...
    } else if (p->write.current->type == PA_PSTREAM_ITEM_SHMRELEASE) {
        ...
    } else if (p->write.current->type == PA_PSTREAM_ITEM_SHMREVOKE) {
        ...
    } else {
        uint32_t flags;
        bool send_payload = true;

        pa_assert(p->write.current->type == PA_PSTREAM_ITEM_MEMBLOCK);
        pa_assert(p->write.current->chunk.memblock);
        ...
    }
}

As we can see above, the item to be written is popped from the send queue and then handled according to its type. Our point of interest is how an ITEM_MEMBLOCK is handled. That branch was cut above for brevity; a fuller excerpt follows:

        bool send_payload = true;
        pa_assert(p->write.current->type == PA_PSTREAM_ITEM_MEMBLOCK);
        pa_assert(p->write.current->chunk.memblock);

        ...
        if (p->use_shm) {
            uint32_t block_id, shm_id;
            size_t offset, length;
            uint32_t *shm_info = (uint32_t *) &p->write.minibuf[PA_PSTREAM_DESCRIPTOR_SIZE];
            size_t shm_size = sizeof(uint32_t) * PA_PSTREAM_SHM_MAX;
            pa_mempool *current_pool = pa_memblock_get_pool(p->write.current->chunk.memblock);
            pa_memexport *current_export;

            ...
            if (pa_memexport_put(current_export,
                                 p->write.current->chunk.memblock,
                                 &block_id,
                                 &shm_id,
                                 &offset,
                                 &length) >= 0) {

                flags |= PA_FLAG_SHMDATA;
                if (pa_mempool_is_remote_writable(current_pool))
                    flags |= PA_FLAG_SHMWRITABLE;
                send_payload = false;

                shm_info[PA_PSTREAM_SHM_BLOCKID] = htonl(block_id);
                shm_info[PA_PSTREAM_SHM_SHMID] = htonl(shm_id);
                shm_info[PA_PSTREAM_SHM_INDEX] = htonl((uint32_t) (offset + p->write.current->chunk.index));
                shm_info[PA_PSTREAM_SHM_LENGTH] = htonl((uint32_t) p->write.current->chunk.length);

                p->write.descriptor[PA_PSTREAM_DESCRIPTOR_LENGTH] = htonl(shm_size);
                p->write.minibuf_validsize = PA_PSTREAM_DESCRIPTOR_SIZE + shm_size;
            }
            ...
        }

Note that the code fills the descriptor fields of the write descriptor p->write.descriptor. This is required when preparing any type of item, including ITEM_MEMBLOCK. That descriptor includes basic details like the "length" of data to be written and any flags.

How is the descriptor's length field set?

In the write descriptor, the "length" of data accompanying the descriptor is written in the descriptor field descriptor[PA_PSTREAM_DESCRIPTOR_LENGTH].

When we're not sending full audio blocks, but rather just their SHM references, the length becomes the size of the "SHM descriptor" shm_info. That info is a simple array of four 32-bit integers containing the block ID, the SHM ID, and the index and length of the referenced data within that SHM segment.

That's why when sending SHM references instead of actual audio data, we have:

    size_t shm_size = sizeof(uint32_t) * PA_PSTREAM_SHM_MAX;
    p->write.descriptor[PA_PSTREAM_DESCRIPTOR_LENGTH] = htonl(shm_size);

But if the block's full audio data is to be written instead of just a reference, we have:

    if (send_payload) {
        p->write.descriptor[PA_PSTREAM_DESCRIPTOR_LENGTH] = htonl((uint32_t) p->write.current->chunk.length);
        p->write.memchunk = p->write.current->chunk;
        pa_memblock_ref(p->write.memchunk.memblock);
    }

and it's clear that the length field put in the descriptor is the length of the actual audio data itself.
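The two cases can be put side by side in a minimal sketch. The helper name sketch_descriptor_length() and the SKETCH_SHM_FIELDS constant are assumptions made for illustration; the count of four follows the BLOCKID, SHMID, INDEX and LENGTH fields named in this article:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <arpa/inet.h>

/* Assumption: the SHM descriptor carries four 32-bit fields
 * (BLOCKID, SHMID, INDEX, LENGTH), as named in the text. */
#define SKETCH_SHM_FIELDS 4

/* Return the value stored in the descriptor's length field, already
 * converted to network byte order as the pstream code does with htonl(). */
uint32_t sketch_descriptor_length(bool send_payload, size_t chunk_length) {
    size_t shm_size = sizeof(uint32_t) * SKETCH_SHM_FIELDS;
    size_t len = send_payload ? chunk_length : shm_size;

    assert(len <= UINT32_MAX);  /* the field is only 32 bits wide */

    return htonl((uint32_t) len);
}
```

Under these assumptions a 4096-byte chunk announces a length of 4096 when sent as payload, but only 16 when sent as a SHM reference, regardless of how large the referenced audio data is.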

What is send_payload, and why is it important?

The send_payload variable plays a central role in the pstreams code. If it is set, we send the block's actual audio data over the pipe rather than just its SHM reference. Two cases matter:

  • If we're sending a SHM block, but the pipe does not support SHM transfer (p->use_shm is false), then send_payload stays true and the code falls back to sending the full audio data over the pipe.
  • If we're sending a SHM block, and the pipe supports SHM transfer (and pa_memexport_put() succeeds), then send_payload is set to false and the PA_FLAG_SHMDATA flag is set. That flag tells the other endpoint that we're sending a reference instead of the full audio data.
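The decision described in these two cases boils down to a few lines. The following is only a sketch: sketch_send_payload() is a hypothetical helper, and the value given to SKETCH_FLAG_SHMDATA is an assumption for illustration, not a statement about the real PA_FLAG_SHMDATA constant:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag value for this sketch only. */
#define SKETCH_FLAG_SHMDATA 0x80000000u

/* Decide whether the audio bytes themselves must travel over the socket.
 *   use_shm:   both endpoints agreed on SHM transfer
 *   export_ok: the block could be exported as a SHM reference
 *   flags:     descriptor flags, updated in place */
bool sketch_send_payload(bool use_shm, bool export_ok, uint32_t *flags) {
    bool send_payload = true;  /* default: fall back to a full copy */

    assert(flags);

    if (use_shm && export_ok) {
        *flags |= SKETCH_FLAG_SHMDATA;  /* tell the peer: reference only */
        send_payload = false;
    }

    return send_payload;
}
```

Defaulting send_payload to true means every failure path (no SHM support, or an export that fails) ends up on the safe side: the data is copied over the socket.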

What is the "shm reference"?

The term "SHM reference" has been used multiple times, including in Lennart's original commit message above. That "reference" is the data the other PA endpoint needs to successfully open the SHM area.

In the case of POSIX SHM, that reference is mainly the shm_id variable. If the ID is XXXX, then the other end does a shm_open() on the /dev/shm/pulse-shm-XXXX SHM file. This makes it one of the most important fields to send to the other end.
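As a small illustration of how an ID maps to a segment name, here is a sketch that builds the name a receiver could hand to shm_open(). The helper sketch_segment_name() is hypothetical, and the decimal "pulse-shm-<id>" pattern is an assumption derived from the path quoted above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Build the POSIX SHM object name for a given segment ID.
 * Assumption: the name follows the "pulse-shm-<id>" pattern with a decimal ID.
 * Returns what snprintf() returns: the length of the full name. */
int sketch_segment_name(char *buf, size_t len, unsigned id) {
    assert(buf && len > 0);

    /* POSIX SHM names start with a slash; on Linux the object then shows up
     * under /dev/shm/ without it. */
    return snprintf(buf, len, "/pulse-shm-%u", id);
}
```

A receiver would pass the resulting name to shm_open() with O_RDONLY and then mmap() the returned file descriptor, which matches the read-only sharing described in the commit message.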

Information extracted from the SHM mempool is then saved in the shm_info array, which holds four integers: BLOCKID, SHMID, INDEX and LENGTH. These four fields are all the other end needs to open, mmap, and read the passed memory contents.
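To make the wire layout concrete, the sketch below packs the four fields into network byte order and unpacks them again. The struct, enum and function names are all hypothetical; only the field order and the use of htonl() follow the pstream excerpt shown earlier:

```c
#include <assert.h>
#include <stdint.h>
#include <arpa/inet.h>

/* Field order follows the four names in the text; the enum itself is a sketch. */
enum { SK_BLOCKID, SK_SHMID, SK_INDEX, SK_LENGTH, SK_MAX };

/* Hypothetical host-side view of one SHM reference. */
struct sketch_shm_ref {
    uint32_t block_id;  /* which exported block */
    uint32_t shm_id;    /* which SHM segment to open */
    uint32_t index;     /* byte offset of the data inside that segment */
    uint32_t length;    /* number of valid bytes at that offset */
};

/* Sender side: serialize the reference in network byte order. */
void sketch_pack_ref(const struct sketch_shm_ref *r, uint32_t wire[SK_MAX]) {
    assert(r && wire);
    wire[SK_BLOCKID] = htonl(r->block_id);
    wire[SK_SHMID]   = htonl(r->shm_id);
    wire[SK_INDEX]   = htonl(r->index);
    wire[SK_LENGTH]  = htonl(r->length);
}

/* Receiver side: convert back to host byte order. */
void sketch_unpack_ref(const uint32_t wire[SK_MAX], struct sketch_shm_ref *r) {
    assert(r && wire);
    r->block_id = ntohl(wire[SK_BLOCKID]);
    r->shm_id   = ntohl(wire[SK_SHMID]);
    r->index    = ntohl(wire[SK_INDEX]);
    r->length   = ntohl(wire[SK_LENGTH]);
}
```

Whatever the size of the audio block, the reference stays at sixteen bytes, which is the whole point of sending references instead of data.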

How is information extracted from the to-be-sent block?

As we've seen above, to send a full "SHM reference", the pstreams code needs to extract a lot of information from the block to be sent (as a reference). The method responsible for this extraction resides outside of the pstreams code, in memblock.c, and is called pa_memexport_put().

Its key assignments are shown below:

int pa_memexport_put(pa_memexport *e, pa_memblock *b, uint32_t *block_id, uint32_t *shm_id, size_t *offset, size_t * size) {
    ...
    *block_id = (uint32_t) (slot - e->slots + e->baseidx);
    ...
    *shm_id = memory->id;
    *offset = (size_t) ((uint8_t*) data - (uint8_t*) memory->ptr);
    *size = b->length;
    ...
    return 0;
}
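Both block_id and offset come from plain pointer arithmetic. The following sketch reproduces those two computations with hypothetical types and helper names (struct sketch_slot, sketch_block_id(), sketch_offset()); it makes no claim about the rest of pa_memexport_put():

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical export slot; its contents do not matter for the arithmetic. */
struct sketch_slot { int in_use; };

/* block_id: position of the slot within the export table, plus a base index. */
uint32_t sketch_block_id(const struct sketch_slot *slots,
                         const struct sketch_slot *slot,
                         uint32_t baseidx) {
    assert(slots && slot >= slots);
    return (uint32_t) (slot - slots) + baseidx;
}

/* offset: distance in bytes from the start of the SHM segment to the
 * block's data, so the peer can find the data after mapping the segment. */
size_t sketch_offset(const void *segment_base, const void *data) {
    const uint8_t *base = segment_base;
    const uint8_t *p = data;

    assert(base && p >= base);
    return (size_t) (p - base);
}
```

Sending an offset rather than a pointer is what makes the reference meaningful in another process: the peer maps the same segment at a different address and adds the offset to its own base.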

How does a pstream decide whether to support SHM transfers?

This is a very simple operation. When the client connects to the server, they both send version numbers to each other. The most significant bit of that version word indicates whether the sending end supports pstreams SHM transfer.
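A minimal sketch of that handshake logic follows. The constant names, the mask for the version bits, and both helper functions are assumptions made for illustration; the only detail taken from the text is that the most significant bit signals SHM support:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: a 32-bit word whose top bit signals SHM support
 * and whose low bits carry the protocol version. */
#define SKETCH_FLAG_SHM     0x80000000u
#define SKETCH_VERSION_MASK 0x0000FFFFu

/* Build the word one endpoint announces to the other. */
uint32_t sketch_announce(uint32_t version, bool local_shm) {
    return (version & SKETCH_VERSION_MASK) | (local_shm ? SKETCH_FLAG_SHM : 0u);
}

/* SHM transfer is only used when both ends support it. */
bool sketch_negotiate_shm(bool local_shm, uint32_t remote_word) {
    return local_shm && (remote_word & SKETCH_FLAG_SHM) != 0;
}
```

The result of such a negotiation is the boolean that an endpoint would hand to pa_pstream_enable_shm().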

When a PA endpoint finds that it supports SHM transfer and the other endpoint does so too, it simply calls pa_pstream_enable_shm(). That method mainly sets the p->use_shm boolean to true:

void pa_pstream_enable_shm(pa_pstream *p, bool enable) {
    pa_assert(p);
    pa_assert(PA_REFCNT_VALUE(p) > 0);

    p->use_shm = enable;
    ...
}

But why is this flag important? Remember when the pstream was sending the memblock above? If p->use_shm is set, it will just send "SHM references" to the other end instead of the blocks' full audio data. While doing so, it will set PA_FLAG_SHMDATA so the other end knows that we're sending SHM references instead of full audio data.

So without this flag set to true, PulseAudio sends every block as a full copy of the audio data over the socket, even for blocks backed by SHM pools. This makes sense: if the other end does not support SHM transfers, things would break if we sent only SHM references instead of the audio data itself.