EihiS

July 3, 2015

rPI B+ : dispmanx resources , VC GPU, and mailboxes hacks

Filed under: Raspberry 3.14, posted by admin @ 3:03 pm

This post is provided ‘as is’ etc.
Primary need: get access to GPU-allocated memory blocks in order to share them between 2 different apps (namely an SDL2 / dispmanx-resource-based app and an OpenGL ES 2.0-based one).

The vcdbg program can be found in /opt/vc/bin.

Run (sudo) vcdbg --help for a list of commands.

//

Let’s assume a vc_dispmanx_resource was created by an app (either still running, or stopped without cleaning up the resource).

#vcdbg reloc | grep dispmanx_resource

could give a result like:

[  30] 0x10a71a20: used 240K (refcount 2 lock count 8, size   245760, align   32, data 0x10a71a40, d1rual) 'dispmanx_resource'

For this particular result, the resource used for the test is known to be a 512×160 pixel, RGB888 surface.

The size matches (512×160×3 = 245760), so this GPU memory block is the one belonging to the dispmanx resource.

NOTE : ** the [ 30] at the start of the line is the VC_MEM_HANDLE number **
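
To make the matching step less manual, here is a minimal sketch in C that pulls the handle, size and data address out of one "vcdbg reloc" line and compares the size with the expected surface size. The names parse_reloc_line, reloc_entry and surface_bytes are hypothetical helpers, not part of vcdbg or the userland libraries, and the parsing only assumes the line layout shown above.

```c
#include <stdio.h>
#include <string.h>

/* Fields of interest from one "vcdbg reloc" line (hypothetical helper type). */
struct reloc_entry {
    unsigned handle; /* number in brackets, e.g. [  30] */
    unsigned size;   /* "size" field, in bytes */
    unsigned data;   /* "data" field: GPU-side address of the payload */
};

/* Returns 0 on success, -1 if the line does not have the expected shape. */
int parse_reloc_line(const char *line, struct reloc_entry *out)
{
    const char *p = strchr(line, '[');
    if (!p || sscanf(p, "[%u]", &out->handle) != 1) return -1;

    p = strstr(line, "size ");
    if (!p || sscanf(p, "size %u", &out->size) != 1) return -1;

    p = strstr(line, "data ");
    if (!p || sscanf(p, "data %x", &out->data) != 1) return -1;
    return 0;
}

/* Expected byte size of a tightly packed surface. */
unsigned surface_bytes(unsigned width, unsigned height, unsigned bytes_per_pixel)
{
    return width * height * bytes_per_pixel;
}
```

Feeding it the line from the example and checking that surface_bytes(512, 160, 3) equals the parsed size is the same comparison as the one done by hand above.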

Using vcdbg, one can dump this GPU memory area to a file by doing:

#vcdbg save /path_to_folder/filename.raw  0x10a71a40 245760
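
To look at such a dump, one option is to wrap it in a PPM header. This is only a sketch: ppm_from_rgb888 is a hypothetical helper, and it assumes the dump is tightly packed with bytes in R, G, B order (if the GPU stores them in another order, the colors will come out swapped).

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Builds a binary PPM (P6) image in dst from tightly packed RGB888 pixels.
   Returns the number of bytes written, or 0 if dst is too small. */
size_t ppm_from_rgb888(uint8_t *dst, size_t cap, const uint8_t *pixels,
                       unsigned width, unsigned height)
{
    char header[32];
    int hlen = snprintf(header, sizeof header, "P6\n%u %u\n255\n", width, height);
    size_t n = (size_t)width * height * 3; /* payload size in bytes */

    if (hlen < 0 || (size_t)hlen >= sizeof header) return 0;
    if ((size_t)hlen + n > cap) return 0;

    memcpy(dst, header, (size_t)hlen);
    memcpy(dst + hlen, pixels, n);
    return (size_t)hlen + n;
}
```

For the 512×160 dump above, the caller would read the 245760 bytes of the .raw file, call this with width 512 and height 160, and write the result to a .ppm file.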
** raw notes following **

and from : https://github.com/raspberrypi/userland/blob/master/interface/vmcs_host/vc_vchi_dispmanx.c

#ifdef SELF_HOSTED
VCHPRE_ int VCHPOST_ vc_dispmanx_resource_write_data_handle( DISPMANX_RESOURCE_HANDLE_T handle, VC_IMAGE_TYPE_T src_type /* not used */,
                                                             int src_pitch, VCHI_MEM_HANDLE_T mem_handle, uint32_t offset,
                                                             const VC_RECT_T * rect ) {
   int32_t bulk_len;
   uint32_t param[3];
   uint32_t success = 0;

   //Note that x coordinate of the rect is NOT used
   //Address of data in host
   offset += src_pitch * rect->y;
   bulk_len = src_pitch * rect->height;

   //Now send the bulk transfer across
   //command parameters: resource handle, destination y, bulk length
   param[0] = VC_HTOV32(handle);
   param[1] = VC_HTOV32(rect->y);
   param[2] = VC_HTOV32(bulk_len);
   success = dispmanx_send_command(  EDispmanBulkWrite | DISPMANX_NO_REPLY_MASK, param, sizeof(param));
   if(success == 0)
   {
      lock_obtain();
      success = vchi_bulk_queue_transmit_reloc( dispmanx_client.client_handle[0],
                                                mem_handle, offset,
                                                bulk_len,
                                                VCHI_FLAGS_BLOCK_UNTIL_DATA_READ,
                                                NULL );
      lock_release();
   }
   return (int) success;
}
#endif
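
The only arithmetic in the function above is the byte window that gets transferred: whole rows, starting at row rect->y. A small standalone sketch of that calculation (bulk_window and bulk_window_for_rows are hypothetical names, not userland functions):

```c
#include <stdint.h>

/* Byte window covered by a row-based bulk transfer (hypothetical helper type). */
struct bulk_window {
    uint32_t offset;   /* start of the first transferred row, in bytes */
    int32_t  bulk_len; /* number of bytes transferred */
};

/* Mirrors the two lines of arithmetic in the function above:
   the x coordinate and the width of the rect play no part. */
struct bulk_window bulk_window_for_rows(uint32_t base_offset, int src_pitch,
                                        int rect_y, int rect_height)
{
    struct bulk_window w;
    w.offset = base_offset + (uint32_t)(src_pitch * rect_y);
    w.bulk_len = src_pitch * rect_height;
    return w;
}
```

With the 512-pixel-wide RGB888 surface from earlier (pitch 1536 bytes), rows 10 to 29 give an offset of 15360 and a length of 30720 bytes.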

It looks possible to move data between GPU memory blocks. The idea here is to use this function to hand GPU bitmaps from one app to another without the extra copy from the GPU to the ARM.
A GPU-to-GPU move *should* be faster.

– and raw notes –

19-09-2015 update:

After many failed attempts to recompile the libraries with “SELF_HOSTED” enabled, I have successfully tested a different approach to the problem:

Successful tests were made by switching a dispmanx element’s mem_handle reference between 2 different GPU memory areas.

Successful tests were also made with direct access to the GPU’s bitmap memory blocks.
This involves quite a few manipulations, including building, sending and receiving mailbox messages (“get_mem_handle”, “mem_lock” / “mem_unlock”, ..), and mmap open/close to access the GPU data from the ARM side (read, write or modify).

From the SDL2.0 point of view, an SDL texture created with RGBA8888 properties is stored on the GPU side with format VC_IMAGE_TF_RGBA32. So, if one needs to modify a dispmanx resource’s memory handle value, the dispmanx resource should be created with this type; things are then easier (the data displays correctly without extra byte conversion).

Also, due to the alignment of blocks in GPU memory (“align 32” in the vcdbg output), it’s easier to choose powers of two for the bitmap sizes and 4 bytes per pixel (RGBA): the width * height * bytesperpixel result is then already aligned, and easy to find in the “vcdbg reloc” results list.
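
A minimal sketch of the size arithmetic, assuming the alignment is 32 bytes and applies to the pitch of each row (this matches what the hello_dispmanx sample does, but treat it as an assumption here). align_up and aligned_pitch are hypothetical helper names.

```c
#include <stdint.h>

/* Rounds value up to the next multiple of alignment (alignment must be a power of two). */
uint32_t align_up(uint32_t value, uint32_t alignment)
{
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Row pitch in bytes, assuming rows are padded to a 32-byte boundary. */
uint32_t aligned_pitch(uint32_t width, uint32_t bytes_per_pixel)
{
    return align_up(width * bytes_per_pixel, 32);
}
```

A 512-wide RGBA bitmap needs no padding (pitch 2048), while a 100-wide RGB888 one would be padded from 300 to 320 bytes per row, so its block size no longer equals width * height * 3 and is harder to spot in the list.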

– popcornmix wrote :

“vc_dispmanx_resource_get_image_handle(res)” returns a (GPU) physical address containing a VC_IMAGE_T structure (the third and fourth shorts contain width and height).
There is a mem_handle in this structure that can be passed to the mailbox lock function to get a physical address to the memory.

This is the struct the function gives a pointer to:

VC_IMAGE_T , from vc_image.h

struct VC_IMAGE_T {
#ifdef __HIGHC__
VC_IMAGE_TYPE_T type; /* Metaware will use 16 bits for this enum
so use the correct type for debug info */
#else
unsigned short type; /* should restrict to 16 bits */
#endif
VC_IMAGE_INFO_T info; /* format-specific info; zero for VC02 behaviour */
unsigned short width; /* width in pixels */
unsigned short height; /* height in pixels */
int pitch; /* pitch of image_data array in bytes */
int size; /* number of bytes available in image_data array */
void *image_data; /* pixel data */
VC_IMAGE_EXTRA_T extra; /* extra data like palette pointer */
VC_METADATA_HEADER_T *metadata; /* metadata header for the image */
struct opaque_vc_pool_object_s *pool_object; /* nonNULL if image was allocated from a vc_pool */
MEM_HANDLE_T mem_handle; /* the mem handle for relocatable memory storage */
int metadata_size; /* size of metadata of each channel in bytes */
int channel_offset; /* offset of consecutive channels in bytes */
uint32_t video_timestamp;/* 90000 Hz RTP times domain - derived from audio timestamp */
uint8_t num_channels; /* number of channels (2 for stereo) */
uint8_t current_channel;/* the channel this header is currently pointing to */
uint8_t linked_multichann_flag;/* Indicate the header has the linked-multichannel structure*/
uint8_t is_channel_linked; /* Track if the above structure is been used to link the header
into a linked-mulitchannel image */
uint8_t channel_index; /* index of the channel this header represents while
it is being linked. */
uint8_t _dummy[3]; /* pad struct to 64 bytes */
};
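
As a sketch of how the first fields could be read from a raw copy of this 64-byte structure: the offsets below follow from the listing (type and info taken as 16 bits each, so width and height are the third and fourth shorts, followed by two 32-bit ints) and assume little-endian data with no padding. vc_image_summary and vc_image_peek are hypothetical names; offsets of later fields such as mem_handle depend on type sizes not shown here and are deliberately left out.

```c
#include <stdint.h>

/* The leading fields of a VC_IMAGE_T, as decoded from raw bytes (hypothetical type). */
struct vc_image_summary {
    uint16_t type;
    uint16_t width;
    uint16_t height;
    int32_t  pitch;
    int32_t  size;
};

static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* raw must point to at least 16 bytes of the structure. */
struct vc_image_summary vc_image_peek(const uint8_t *raw)
{
    struct vc_image_summary s;
    s.type   = rd16(raw + 0);           /* first short */
    s.width  = rd16(raw + 4);           /* third short */
    s.height = rd16(raw + 6);           /* fourth short */
    s.pitch  = (int32_t)rd32(raw + 8);  /* assumed offset */
    s.size   = (int32_t)rd32(raw + 12); /* assumed offset */
    return s;
}
```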

To modify the data inside it, one needs to mmap the returned pointer to an ARM-side pointer.
Then the data can be read, modified, etc.
From my tests, I’ve found that modifications to this struct are not taken into account immediately.
These structs can be listed using “vcdbg malloc”; they are 64 bytes in size, matching the definition listed above.

For the dispmanx element to be updated once the structure data has been modified, the only way I’ve found is to call a ‘dispmanx_resource_change’ function (pointing it to a ‘fake’ other resource), then call the same function again to make it use the original resource. That way the structure data is reloaded and the changes you made are applied.
//

One may also need to access the GPU-side bitmap data.
To do this, get the VC_MEM_HANDLE value from the structure.

Then use the mailbox 0x30014 message to get the GPU’s pointer for this MEM_HANDLE. (Note that the Mailbox property interface wiki describes this tag as returning the mem handle for a dispmanx resource handle, with the bus address coming back from the lock call.) Once the mailbox sends back the pointer, you can compare it with the “vcdbg reloc” results.

You have to lock the GPU’s memory (mailbox: memory_lock(handle)).

Then read/write/modify the data through an mmap’d pointer to the GPU-side bitmap.

Then mailbox a ‘mem_unlock(handle)’

..and you’re done.
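
The messages used in the steps above all have the same single-tag shape. Below is a minimal sketch of how such a buffer could be filled in, following the layout used by mailbox.c in hello_fft. build_single_tag is a hypothetical helper; actually sending the buffer (an ioctl on the mailbox device, as hello_fft's mbox_property does) and the mmap of the locked address are not shown.

```c
#include <stdint.h>
#include <stddef.h>

/* Tag ids from the Mailbox property interface wiki. */
#define TAG_LOCK_MEMORY                      0x0003000d
#define TAG_UNLOCK_MEMORY                    0x0003000e
#define TAG_GET_DISPMANX_RESOURCE_MEM_HANDLE 0x00030014

/* Fills buf with a property message holding one tag and one 32-bit argument.
   value_bytes is the size of the tag's value buffer (at least 4).
   Returns the number of 32-bit words used; buf[0] holds the size in bytes.
   After a successful call to the firmware, the first reply word is at buf[5]. */
size_t build_single_tag(uint32_t *buf, uint32_t tag, uint32_t value_bytes, uint32_t arg)
{
    size_t i = 0;
    size_t words = (value_bytes + 3) / 4;
    size_t k;

    buf[i++] = 0;           /* total size, patched below */
    buf[i++] = 0x00000000;  /* process request */
    buf[i++] = tag;         /* tag id */
    buf[i++] = value_bytes; /* size of the value buffer */
    buf[i++] = 4;           /* size of the request data: one argument */
    buf[i++] = arg;         /* handle */
    for (k = 1; k < words; k++)
        buf[i++] = 0;       /* room for a longer reply */
    buf[i++] = 0x00000000;  /* end tag */

    buf[0] = (uint32_t)(i * sizeof *buf);
    return i;
}
```

The sequence would then be: 0x30014 with a value buffer of 8 bytes, lock with 4 bytes, access the data, unlock with 4 bytes. The buffer itself should be 16-byte aligned, as in the declaration further down.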

// The related function from the discussion has been added (TAG 0x00030014, Get dispmanx resource mem handle) at https://github.com/raspberrypi/firmware/wiki/Mailbox-property-interface

//

//// hot-note, SDL related, SDL_Texture structure


struct SDL_Texture
{
    const void *magic; // 4 = 15 d1 f4 b6
    Uint32 format; // 4 = 04 18 16 16
    int access; // 4 = 00 00 00 00
    int w; // 4 = 00 00 00 c8
    int h; // 4 = 00 00 00 c8
    int modMode; // 4 = 00 00 00 00
    SDL_BlendMode blendMode; // 6 = 00 00 00 00 00 00
    Uint8 r, g, b, a; // 4 = ff ff ff ff

    SDL_Renderer *renderer; // 32bit pointer 4 = 18 3b a0 b5

    /* Support for formats not supported directly by the renderer */
    SDL_Texture *native; // 32bit arm pointer 4 = 00 00 00 00
    SDL_SW_YUVTexture *yuv; // 4 = 00 00 00 00
    void *pixels;
    int pitch;

    void *driverdata;
};
// hotnote , SDL_Texture* datas
15 D1 F4 B6 04 18 16 16 00 00 00 00 C8 00 00 00 // c8 = 200 = width
C8 00 00 00 00 00 00 00 00 00 00 00 FF FF FF FF // c8 = 200 = height
18 3B A0 B5 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 28 42 A0 B5 00 00 00 00 00 00 00 00
00 00 00 00 25 00 00 00 01 00 00 00 E1 0D 00 00
08 19 00 00 01 14 00 00 00 00 00 00 00 00 00 00
00 00 00 00 35 00 00 00 00 00 00 00 00 00 00 00
20 04 00 00 00 00 FF 00 00 FF 00 00 FF 00 00 00
00 00 00 00 00 00 00 08 10 08 00 00 00 00 00 00
50 1F A0 B5 45 00 00 00 00 00 00 00 00 00 00 00
C8 00 00 00 C8 00 00 00 20 03 00 00 08 E0 B5 B5
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 C8 00 00 00 C8 00 00 00 00 00 00 00
00 00 00 00 89 00 00 00 40 00 A0 B5 40 00 A0 B5
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
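
A small sketch for reading the first fields back out of such a dump, assuming the little-endian 32-bit layout of the struct listing above (SDL internals, so the layout may differ between SDL versions). texture_head and decode_texture_head are hypothetical names.

```c
#include <stdint.h>

/* The first five 32-bit fields of the dumped SDL_Texture (hypothetical type). */
struct texture_head {
    uint32_t magic;
    uint32_t format;
    int32_t  access;
    int32_t  w;
    int32_t  h;
};

static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* dump must point to at least 20 bytes. */
struct texture_head decode_texture_head(const uint8_t *dump)
{
    struct texture_head t;
    t.magic  = le32(dump + 0);
    t.format = le32(dump + 4);
    t.access = (int32_t)le32(dump + 8);
    t.w      = (int32_t)le32(dump + 12);
    t.h      = (int32_t)le32(dump + 16);
    return t;
}
```

On the first two rows of the dump this gives w = 200 and h = 200, with the format bytes 04 18 16 16 read as 0x16161804.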
// texture_magic macro :
#define CHECK_TEXTURE_MAGIC(texture, retval) \
    if (!texture || texture->magic != &texture_magic) { \
        SDL_SetError("Invalid texture"); \
        return retval; \
    }

// mailboxes and co, relevant cut’n’pastes
//
// github, hello_pi hello_fft : mailbox.h , mailbox.c

/opt/vc/src/hello_pi/hello_fft/mailbox.c
/opt/vc/src/hello_pi/hello_fft/mailbox.h
//

With the exception of the property tags mailbox channel, when passing memory addresses as the data part of a mailbox message, the addresses should be bus addresses as seen from the VC. These vary depending on whether the L2 cache is enabled. If it is, physical memory is mapped to start at 0x40000000 by the VC MMU; if L2 caching is disabled, physical memory is mapped to start at 0xC0000000 by the VC MMU. Returned addresses (both those returned in the data part of the mailbox response and any written into the buffer you passed) will also be as mapped by the VC MMU. In the exceptional case when you are using the property tags mailbox channel you should send and receive physical addresses (the same as you’d see from the ARM before enabling the MMU).

For example, if you have created a framebuffer description structure in memory (without having enabled the ARM MMU) at 0x00010000 and you have not changed config.txt to disable the L2 cache, to send it to channel 1 you would send 0x40010001 (0x40000000 | 0x00010000 | 0x1) to the mailbox. Your structure would be updated to include a framebuffer address starting from 0x40000000 (e.g. 0x4D385000) and you would write to it using the corresponding ARM physical address (e.g. 0x0D385000).

From the above :

volatile unsigned int mailbox[100] __attribute__ ((aligned(16))); // align16, 100 uint32_t for tags etc

then calling a function like:

mailbox_write((uint32_t) mailbox + 0x40000000, channel); // cast to an integer first, then add the offset (adding to the pointer itself would scale by 4); L2 cache on, else +0xC0000000 for L2 disabled
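
The address arithmetic from the quoted wiki text can be written out as a few one-liners. This is just a sketch of the conversions described there; arm_to_bus, bus_to_arm and mailbox_word are hypothetical names, and the 0x3FFFFFFF mask simply drops the two alias bits.

```c
#include <stdint.h>

#define VC_BUS_L2_ENABLED  0x40000000u /* alias used when the L2 cache is on */
#define VC_BUS_L2_DISABLED 0xC0000000u /* alias used when the L2 cache is off */

/* ARM physical address -> bus address as seen from the VC. */
uint32_t arm_to_bus(uint32_t phys, uint32_t alias_base)
{
    return phys | alias_base;
}

/* Bus address -> ARM physical address (drops the alias bits). */
uint32_t bus_to_arm(uint32_t bus)
{
    return bus & 0x3FFFFFFFu;
}

/* Word written to the mailbox: 16-byte aligned address plus channel in the low 4 bits. */
uint32_t mailbox_word(uint32_t bus_addr, uint32_t channel)
{
    return (bus_addr & ~0xFu) | (channel & 0xFu);
}
```

With the numbers from the quote, a structure at 0x00010000 sent to channel 1 becomes 0x40010001, and a returned framebuffer address of 0x4D385000 maps back to 0x0D385000.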