Enhance texture format probing for macOS support #63
Conversation
jensroth-git commented on Aug 19, 2025
- Updated r3d_pass_final_blit to use render size for HiDPI/Retina support.
- Introduced r3d_try_internal_format function to probe texture formats and updated texture support checks in r3d_support_check_texture_internal_formats.
- Fixed custom frame buffer height calculation
- Added background color clearing in the Draw function of examples/dof.c.
|
Just tested on Windows and Mac, unfortunately can't test on Linux |
|
For the blit, thanks for the fix! I think I’ll switch to a simple “copy” via shader anyway, since blit causes quite a few other problems… These two cases (format check and blit) are still bothering me, I’ll think about what can be done. I'm merging and will test and refine this right away. Thanks! |
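Replacing glBlitFramebuffer with a shader "copy" usually just means drawing a fullscreen triangle that samples the source texture; a hypothetical sketch of such shaders as C string literals (not r3d's actual shaders) might look like this:

```c
// Hypothetical fullscreen "copy" shaders (GLSL 330) stored as C strings;
// not r3d's actual shaders, just the usual blit-replacement pattern.
static const char *COPY_VS =
    "#version 330 core\n"
    "out vec2 vUV;\n"
    "void main() {\n"
    "    // Fullscreen triangle generated from gl_VertexID, no vertex buffer needed\n"
    "    vec2 pos = vec2((gl_VertexID << 1) & 2, gl_VertexID & 2);\n"
    "    vUV = pos;\n"
    "    gl_Position = vec4(pos * 2.0 - 1.0, 0.0, 1.0);\n"
    "}\n";

static const char *COPY_FS =
    "#version 330 core\n"
    "in vec2 vUV;\n"
    "out vec4 FragColor;\n"
    "uniform sampler2D uTexture;\n"
    "void main() { FragColor = texture(uTexture, vUV); }\n";
```

On the C side, the copy is then just binding this program and the source texture and calling glDrawArrays(GL_TRIANGLES, 0, 3) with a VAO bound (no vertex buffers needed).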
|
@jensroth-git The internal format support verification code has been revised, I decided to simplify everything by following your approach. We now check both format support and whether it can be used as a framebuffer attachment. Normally, if everything still works, then nothing is broken, but if you could confirm that, it would be great. The commit is here: a6a2c8a |
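For readers following along, that kind of probe (allocate a texture with the candidate internal format, then verify it actually works as a framebuffer attachment) can be sketched roughly like this; it is an illustrative stand-in, not the code from commit a6a2c8a:

```c
#include <stdbool.h>
// Assumes an OpenGL 3.3 context and a loader header (e.g. glad) are already set up.

// Illustrative stand-in, not the code from commit a6a2c8a: check that the
// internal format can be allocated AND used as a framebuffer color attachment.
static bool try_internal_format(GLenum internalFormat, GLenum format, GLenum type)
{
    bool ok = false;
    GLuint tex = 0, fbo = 0;

    while (glGetError() != GL_NO_ERROR) { }   // clear any stale errors

    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexImage2D(GL_TEXTURE_2D, 0, internalFormat, 4, 4, 0, format, type, NULL);

    if (glGetError() == GL_NO_ERROR) {
        glGenFramebuffers(1, &fbo);
        glBindFramebuffer(GL_FRAMEBUFFER, fbo);
        glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, tex, 0);
        ok = (glCheckFramebufferStatus(GL_FRAMEBUFFER) == GL_FRAMEBUFFER_COMPLETE);
        glBindFramebuffer(GL_FRAMEBUFFER, 0);
        glDeleteFramebuffers(1, &fbo);
    }

    glBindTexture(GL_TEXTURE_2D, 0);
    glDeleteTextures(1, &tex);
    return ok;
}
```

Probing a format like GL_RGB16F is a typical case where the allocation may succeed while the attachment check fails on some drivers.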
|
it works great on macOS! |
|
I should note though that on macOS, in the examples, using GetScreenWidth/Height for R3D_Init only gives half the native resolution; you need to use GetRenderWidth/Height, regardless of whether the HighDPI flag is used when setting up the raylib window... I have not looked into how raylib handles (or doesn't handle) this yet. |
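To make that concrete, a minimal sketch of the difference, assuming R3D_Init takes the internal render resolution plus flags (check the r3d header for the exact signature):

```c
#include "raylib.h"
#include "r3d.h"   // header name assumed

int main(void)
{
    SetConfigFlags(FLAG_WINDOW_HIGHDPI);
    InitWindow(800, 600, "r3d HiDPI example");

    // On a Retina display, GetScreenWidth/Height return logical points
    // (half the native pixel count), while GetRenderWidth/Height return
    // the actual framebuffer size, so size the internal targets with the latter.
    R3D_Init(GetRenderWidth(), GetRenderHeight(), 0);   // signature assumed, see r3d.h

    // ... load assets and run the usual update/draw loop ...

    R3D_Close();   // assumed counterpart of R3D_Init
    CloseWindow();
    return 0;
}
```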
|
Thanks, I’ll look into that. However, I don’t have a Mac to test on, so if I do something it’ll be a shot in the dark x) |
|
I think it's really a raylib issue, not something r3d should concern itself with? So screen width isn't even supposed to handle high DPI anyway? I'll have to check tomorrow what values ScreenWidth/Height give for different DPI settings and compare with macOS.
Yes, it seems to me that I’ve already seen this topic quite a few times as an issue on raylib. As for the examples, I’m not sure what visual impact it has, if it’s really blurry or pixelated, then yes, otherwise leave it as is. Apart from the blit, r3d shouldn’t be handling the rest anyway. |
|
Well, they are rendering at half res, it's quite noticeable.

Can I get your input on something? Unless you have a better option? Here's my current implementation:

```c
// Profile this GPU section
R3D_PROF_ZONE_GPU("DOF Pass") {
    glBindFramebuffer(GL_FRAMEBUFFER, R3D.framebuffer.pingPong.id);
    {
        glViewport(0, 0, R3D.state.resolution.width, R3D.state.resolution.height);
        r3d_framebuffer_swap_pingpong(R3D.framebuffer.pingPong);
        r3d_shader_enable(screen.dof);
        {
            // ...
        }
        r3d_shader_disable();
    }
}
```

To get the zone values later there is:

```c
double R3D_ProfGetGPUZoneMS(const char *zoneName, int samplesAverage);
```

```c
// usage to get last 64 frame average
char profText[64];
snprintf(profText, sizeof(profText), "DoF: %.2f", R3D_ProfGetGPUZoneMS("DOF Pass", 64));
DrawText(profText, 10, 10, 20, WHITE);
```

Maybe I'm overengineering, but it seems quite usable 🤔 |
|
Oh yes, I thought of that too; actually it would be nice to have something like that beforehand. Your example looks really good, I have nothing to add; as long as we can set up a precise test in under a minute, that would be perfect. However, how do you plan to profile the GPU? Afaik there are timer queries available in GL 3.3. By the way, in your example, I'm not sure if you plan to store the results by name or otherwise, but if it's only for internal testing, we could simplify things. It's just a quick thought, this would need to be carefully verified. |
|
Here's the current implementation |
|
I pushed the pre optimization state with the current profiler onto my fork, if you want to take a look |
|
I read your example carefully, and if I understood the goal, everything could be summed up in a simple macro that would do everything locally:

```c
#define GPU_PROFILE_BLOCK(Name, NSAMPLES, CodeBlock) \
    do { \
        static GLuint query = 0; \
        static double hist[NSAMPLES] = {0}; \
        static int count = 0, index = 0; \
        \
        if (!query) glGenQueries(1, &query); \
        \
        glBeginQuery(GL_TIME_ELAPSED, query); \
        do { CodeBlock } while(0); \
        glEndQuery(GL_TIME_ELAPSED); \
        \
        GLuint64 ns = 0; \
        glGetQueryObjectui64v(query, GL_QUERY_RESULT, &ns); \
        double ms = ns / 1e6; \
        \
        hist[index] = ms; \
        index = (index + 1) % NSAMPLES; \
        if (count < NSAMPLES) count++; \
        \
        if (count == NSAMPLES) { \
            double sum = 0.0; \
            for (int i = 0; i < NSAMPLES; i++) sum += hist[i]; \
            printf("[GPU] %s avg(%d) = %.3f ms\n", Name, NSAMPLES, sum/NSAMPLES); \
            count = 0; \
        } \
    } while(0)
```

The idea is that you can name the part via 'Name', 'NSAMPLES' is the number of samples you want before getting the average, and 'CodeBlock' is the code you want to profile.

Here's how it can be used:

```c
GPU_PROFILE_BLOCK("DoSomething", 32, {
    glDoSomething();
});
```

Every 32 runs this will give an average in this form:

```
[GPU] DoSomething avg(32) = 0.001 ms
```

I made an example with SDL if you want to try it: https://gist.github.com/Bigfoot71/ac652f10e73b364a5fd5a3b99ef590b3 |
|
That is much simpler indeed, we could even add code to aggregate into a public profiler later. I'm just not sure if blocking the CPU is going to give incorrect results, or if we should make it work with GL_QUERY_RESULT_AVAILABLE and polling like my version did? |
No, it won't give incorrect results, in fact it can even be a bit more precise. What it really tells the CPU is "wait until the GPU has finished the operation and give me the result right away once it's done". Overall it will slow the program down since the CPU is blocked, but the GPU, which runs in parallel, will execute the operation as usual. The GPU behavior may change slightly at the end because of the stall, but the operation itself is still measured correctly.
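For comparison, the non-blocking variant mentioned above boils down to polling GL_QUERY_RESULT_AVAILABLE each frame and only reading the timing once the GPU has produced it; a rough illustrative sketch (not the code from the fork):

```c
#include <stdbool.h>
// Assumes an OpenGL 3.3 context and a loader header are already set up.

static GLuint gQuery = 0;
static bool   gQueryInFlight = false;

void profile_pass_begin(void)
{
    if (!gQuery) glGenQueries(1, &gQuery);
    if (!gQueryInFlight) glBeginQuery(GL_TIME_ELAPSED, gQuery);
}

void profile_pass_end(void)
{
    if (!gQueryInFlight) {
        glEndQuery(GL_TIME_ELAPSED);
        gQueryInFlight = true;
    }
}

// Call once per frame (e.g. after swapping buffers); fills *outMs only when
// the GPU has finished, so the value is typically one or two frames late.
void profile_pass_poll(double *outMs)
{
    if (!gQueryInFlight) return;

    GLint available = 0;
    glGetQueryObjectiv(gQuery, GL_QUERY_RESULT_AVAILABLE, &available);
    if (available) {
        GLuint64 ns = 0;
        glGetQueryObjectui64v(gQuery, GL_QUERY_RESULT, &ns);
        *outMs = ns / 1e6;
        gQueryInFlight = false;   // the query object can be reused next frame
    }
}
```

The trade-off versus the blocking version is latency instead of a stall: frames where the previous result is still pending simply skip the measurement here.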
As for aggregating into a public profiler, that would require a more complex system, one that doesn't simply stall the CPU each time. And since all GPU operations are actually "flushed" at R3D_End, you would need to measure each pass globally and make the results accessible. At that point you are at the boundary between a rendering framework and a full engine, so I'm not entirely sure what would be the most relevant way to expose it.

I'm also not sure such GPU timing should be included in release builds. Either it's only exposed in debug builds (though not many developers will necessarily compile R3D in debug when making a game), or it would need to be a more advanced system that is dynamic and adds virtually no overhead when not active. Most engines only include their profiler in the editor and simply don't ship it in the final release build, which makes things "simpler" on that side... |
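One common way to get the "adds virtually no overhead when not active" property is plain compile-time gating; a hypothetical sketch reusing the zone macro name from the earlier snippet (the R3D_ENABLE_GPU_PROFILER flag and the begin/end helpers are made up for illustration):

```c
// Hypothetical begin/end hooks that would issue and collect the GL timer queries.
void r3d_prof_gpu_zone_begin(const char *name);
void r3d_prof_gpu_zone_end(const char *name);

#ifdef R3D_ENABLE_GPU_PROFILER
    // Runs begin before the braced block that follows and end right after it.
    #define R3D_PROF_ZONE_GPU(Name) \
        for (int once_ = (r3d_prof_gpu_zone_begin(Name), 1); once_; \
             once_ = (r3d_prof_gpu_zone_end(Name), 0))
#else
    // Expands to nothing: the braces still compile as a plain block,
    // and release builds pay no profiling cost.
    #define R3D_PROF_ZONE_GPU(Name)
#endif
```

Usage stays the same as in the earlier snippet: when the flag is off, the macro expands to nothing and the braces that follow it form an ordinary block.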
|
Just a note: the solution I proposed is only meant as a quick aid while developing a feature and was not intended to be included publicly; the macro does not even release the generated query... So yes, your system could be included, but I'm not sure yet about the right way to do it. And some operations are currently difficult to profile accurately (globally), such as draw calls, and that's really due to the internal design... |